Editorial

Eureka 63 


In the long and esteemed tradition of Eureka, I would like to start
by apologising for the lateness of publication this year. Various
organisational issues outside of our control (acts of God, one
might even say) have set us back a few months. Despite this, it has
been a real pleasure to edit Eureka for the first time, and I hope that
I have been able to do the fantastic journal justice.

This year, our focus on the editorial committee has been to increase 
the quantity of original articles from well-known professional math- 
ematicians, to add to the usual set of brilliant articles from students 
here in Cambridge. This has been largely successful, and this issue 
has contributions from many 'big names'. Throughout the editing
process, we've also made efforts to prioritise the accessibility of
articles, and to keep the mathematics readable. With that being said,
however, the wide variety of topics and approaches should mean 
that there is something for everyone, and we’ve marked more tech- 
nical articles with stars in the contents. 

Over the next year, we plan to resume the publication of Qarch, our 
problems journal. We intend to publish it more frequently than has 
previously been the case, although the exact frequency and medium 
remain under discussion. We would be very interested to hear of 
any potential contributors. 

I’ve had the privilege of working with an excellent editorial team, 
who I would like to thank for all of their hard work. I would also like 
to thank former editor Philipp Legner for his invaluable advice and 
support, as well as our writers, our sponsors, the Archimedeans, and 
our readers. To run the risk of sounding cliché - without you none of
this would have been possible. I hope you enjoy reading Eureka 63! 



Jasper Bird 
Editor, 2013 


Editor 

Jasper Bird (Clare)

Assistant Editors 

Carolyn Barker (Queens')
Dávid Szabó (Churchill)
Diana Danciu (Murray Edwards)
Jack Williams (Clare)
Katarzyna Kowal (Churchill)
Michael Grayling (Sidney Sussex)
Yanitsa Pehova (Murray Edwards)

Subscriptions 

Jacquie Hu (Jesus)






Contents


The Archimedeans
On Mathematical Method and Mathematical Proof (Prof Reuben Hersh)
The Fundamental Theorem of Algebra (Prof Ian Stewart FRS)
What does Gravity look like? (Robert Hocking)
Sums, Products and Sums-and-Products (Prof Imre Leader)
The Axiom of Choice (Robin Elliott)
Spot It!® Solitaire (Dr Donna Dietz)
Chains Between Prisoners (Prof Karl Sigmund and Prof Christian Hilbe)
Archimedeans' Annual Problems Drive
n! - Forty Years on (Dr Stephen Castell and Ms Forough Khaleghpour)
Image Restoration (Dr Carola-Bibiane Schönlieb)
Statistically Speaking (Prof John Aston)
Erdős' Favourite Theorem of Pólya (Yanitsa Pehova)
High-Dimensional Data and the Lasso (Rajen Shah)
Galaxies Without Dark Matter (Indranil Banik)
The Death of a Mathematician (Dr Mario Livio)
The Mandelbrot Set (Nikolaos Athanasiou)
Geometry Through the Eyes of Physics (Prof David Tong)
How to Build the Perfect Igloo (Andrzej Odrzywolek)
500 Years of Mathematical Anniversaries
The Mathematics of Pointless (Prof Yigal Gerchak)
Computable Functions (Marc Khoury)
A Binomial Identity (Dávid Szabó)
Turing Instabilities (Diana Danciu)
A Nice Theorem in Multiplicative Functions (Masum Billal)
The Disc Planimeter (Dr Gonzalo Gomez-Mataix)
Stochastic Modelling of Biological Systems (Michael Grayling)
Generalising the Division Algorithm (Samin Riasat)
Minimum Clues: Sudoku and Sudokion (Stephen Jones)
Get Your Geek On!
Funny Section
Lecturer Reviews
Solutions to the Problems Drive
Copyright Notices






The Archimedeans 

James Bell, President 2013 - 2014 


The Archimedeans have delivered another
year of social and mathematical events 
to our members. We have welcomed al- 
most 200 new members and held a variety of 
events. Our speakers this year have included 
Sir Michael Atiyah, Simon Singh and Sir Rog- 
er Penrose, amongst many others, providing 
a talk for almost every Friday of Michaelmas 
and Lent. Topics ranged from numerical anal- 
ysis to number theory and from geometry to 
The Simpsons! 

We have had another successful annual dinner. 
This year, for the first time, the venue was Dou- 
ble Tree by Hilton on the bank of the Cam, and 
also for the first time we invited a fair number 
of members of the faculty to dine with us. 

The Archimedeans' Problems Drive went 
ahead as ever with many teams and a set of 
questions which can be found in this journal. 

The Committee 2013 - 2014 

President 

James Bell (Gonville and Caius)

Vice-President

Dana Ma (Newnham)

Corporate Officer

Maithra Raghu (Trinity)

Secretary

Daochen Wang (Sidney Sussex)


The competition was very tight and in the end 
went down to a tie breaking game of rock paper 
scissors! 

We once again participated in the Science 
societies’ garden party, and again provided 
an abundance of cheese to a party of various 
foods, drinks and jazz. 

In Michaelmas Term we had our usual Fresh- 
ers' squash with plenty of free pizza and a talk 
from Prof David Tong. We also extended our 
use of pizza to our board games night, which 
was plenty of fun for the large number of peo- 
ple who turned up. 

I hope that this edition of Eureka is of great 
interest to you and that our body of members 
and the Archimedeans can continue to thrive 
for many years yet. 


Treasurer 

Rowan Purvis (Jesus)

Events Managers

Lukasz Segiet (St Catharine's)

Publicity Officer

Emily Bain (Emmanuel)

Webmaster

Katarzyna Kowal (Churchill)
Everybody knows that the sum of the first n
integers is n(n + 1)/2. There's an old anecdote
about Carl Friedrich Gauss as a little
boy astonishing his schoolmaster by summing
from 1 to 100 and getting 5,050.

What about the sum of the first n squares, or the
first n cubes? You probably met these in high
school, required to discover the formulas by
intelligent guessing, and then to prove them by
mathematical induction. The sum of the cubes is easy,
because you get 1, 9, 36, 100 as the sum of the
first, the first two, the first three, and the first four
cubes. You can't help noticing that these numbers
are the squares of the sums of the first, the first
two, the first three and the first four integers. You
easily write down the general case and prove it by
induction. The sum of the first n squares is a bit
more trouble, but you can do it. In less than half
an hour. Work it out, just for fun. It comes out as
n(n + 1)(2n + 1)/6. Not as beautiful as the sum
of n cubes.
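For readers who would rather check than guess, here is a minimal sketch that compares the three closed forms above with brute-force sums. The function names `power_sum` and `check_closed_forms` are illustrative choices, not anything from the article.

```python
def power_sum(n, p):
    """Sum of the p-th powers of the first n positive integers, by brute force."""
    return sum(k**p for k in range(1, n + 1))


def check_closed_forms(limit=200):
    """Compare the closed forms for p = 1, 2, 3 with brute force for n = 0..limit."""
    for n in range(limit + 1):
        triangular = n * (n + 1) // 2
        assert power_sum(n, 1) == triangular
        assert power_sum(n, 2) == n * (n + 1) * (2 * n + 1) // 6
        # The sum of the first n cubes is the square of the sum of the first n integers.
        assert power_sum(n, 3) == triangular**2
    return True


print(power_sum(100, 1))                       # Gauss's sum: 5050
print([power_sum(n, 3) for n in range(1, 5)])  # [1, 9, 36, 100]
print(check_closed_forms())                    # True
```

A finite check like this is evidence, not a proof; the induction argument is still what settles the general case.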

Somehow, nobody ever bothers with the sums of
fourth, or fifth, or sixth powers. It doesn't seem
that awfully interesting to just keep going. It
would be worth the trouble for you to figure out
how, if you wanted to, you could actually obtain
formulas for the sums of higher powers, one at a
time. The higher you go, the more trouble it would
be; you'd find that you were solving systems of
n + 1 linear equations. You could write a computer
program to print out all the sum formulas up to
n = 500, p = 500. Forget that, we're not getting into
that sort of thing in this article.

What you really want is a general formula for the
sum of the pth powers of the first n integers, for
any positive integers n and p. Knowing it for p = 1,
2 and 3, or even for p = 1, 2, 3, 4, 5, doesn't seem to
get you close to a general formula!

Experimentation 

Let's lay our information out in a little table. As we
move to the right, we add one more term to the
sum. The first row is the sum of the first powers,
the second row is the sum of the squares, and so
on. Each row starts with 0, because you start with
nothing, and then add terms one at a time.


p \ n     0     1     2     3     4

  1       0     1     3     6    10
  2       0     1     5    14    30
  3       0     1     9    36   100
  4       0     1    17    98   354

Figure 1 Table showing sums of pth powers up to n^p


The table can go on and on, to the right and down- 
wards, but what would we learn from that? At this 
point there are two options. You can just quit. The 
hell with it, this isn't getting anywhere!
The other option is better. It seems going to the
right any further is a waste of time. What else
can you do? Go to the left! Instead of adding 1
and moving right in the first row, subtract 1 and
move left! Instead of adding successive integers
and moving right in the second row, subtract
them and move left! That will mean subtracting
negative numbers, once you go to the left of zero.
Instead of adding squares of integers in the third
row, subtract them! "What for?" you say. "Why
bother?" That's not thinking like a mathematician.
A mathematician is curious. He/she wants
to know more, to understand more. Do it, just to
see what happens! I am resisting the temptation
to do it for you. It's easy enough, you must have a
pencil and paper close by.



Figure 2 Draw your own! 


The expanded rows now have zeroes in the mid- 
dle, and stretch out in both directions. The result 
turns out to be surprisingly simple! There are two 
opposite cases - the even powers and the odd 
powers. In each case, there is symmetry around 
a midpoint. The midpoint is halfway between 0 
and -1. The midpoint is at -1/2. Bit of a shock, 
that! And around that midpoint, the odd powers 
have even symmetry (they are equal on the right 
and the left). The even powers have odd symme- 
try (they are negatives of each other on the right 
and the left). All this is apparent, as soon as you 
extend the rows to the left by subtraction. 
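If pencil and paper are not close by, the leftward extension can be sketched in a few lines. This assumes the convention of the table, S(0) = 0 and S(n) = S(n - 1) + n^p, run backwards as S(n - 1) = S(n) - n^p; the names `extended_row` and `has_symmetry` are illustrative only.

```python
def extended_row(p, width):
    """Return {n: S_p(n)} for -width - 1 <= n <= width, with S_p(0) = 0."""
    row = {0: 0}
    for n in range(1, width + 1):
        row[n] = row[n - 1] + n**p        # move right: add the next term
    for n in range(0, -width - 1, -1):
        row[n - 1] = row[n] - n**p        # move left: subtract the term again
    return row


def has_symmetry(p, width=20):
    """Check S_p(-1 - n) = (-1)^(p + 1) S_p(n), i.e. symmetry about n = -1/2."""
    row = extended_row(p, width)
    sign = (-1) ** (p + 1)
    return all(row[-1 - n] == sign * row[n] for n in range(width + 1))


row1 = extended_row(1, 3)
print([row1[n] for n in range(-4, 4)])   # [6, 3, 1, 0, 0, 1, 3, 6]
row2 = extended_row(2, 3)
print([row2[n] for n in range(-4, 4)])   # [-14, -5, -1, 0, 0, 1, 5, 14]
print(all(has_symmetry(p) for p in range(1, 9)))   # True
```

The two zeroes in the middle sit at n = -1 and n = 0, which is why the midpoint falls at -1/2.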

Now a little baby algebra comes in handy. By shifting
the variable n, it is evident that the sums of
odd powers are even functions of (n + 1/2), and
the sums of even powers are odd functions of
(n + 1/2). Moreover, the sum of the pth powers,
p being even or odd, is a polynomial of degree
p + 1. Therefore, these sums are, respectively, sums
of just even or just odd powers of (n + 1/2).

Does that work for the examples we started with,
for p = 1, 2 and 3? You can check in a few minutes
that the last statement does hold for the expressions
we obtained in the first few lines of this piece.
Just completing the square is all it takes.
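For the impatient, the check can be written out as follows. The symbol m = n + 1/2 is shorthand introduced here for convenience; it does not appear in the article.

```latex
% With m = n + 1/2 we have n(n+1) = m^2 - 1/4 and 2n + 1 = 2m.
\begin{align*}
\sum_{k=1}^{n} k   &= \frac{n(n+1)}{2}
                    = \frac{m^2}{2} - \frac{1}{8}, \\
\sum_{k=1}^{n} k^2 &= \frac{n(n+1)(2n+1)}{6}
                    = \frac{m^3}{3} - \frac{m}{12}, \\
\sum_{k=1}^{n} k^3 &= \left(\frac{n(n+1)}{2}\right)^{2}
                    = \frac{m^4}{4} - \frac{m^2}{8} + \frac{1}{64}.
\end{align*}
```

The odd powers p = 1 and p = 3 give only even powers of m, and p = 2 gives only odd powers of m, as claimed.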

It is fascinating to learn that back in 1615, in the 
time of Pascal, before Newton and Leibniz, an obscure
German city official liked playing with this
kind of algebra. Johannes Faulhaber was his name. 
He published a long-forgotten little pamphlet 
where he actually represented the sums of pow- 
ers of integers as sums of even or odd powers of 
(n + 1/2). How did he know? 

There's more. Once you've gone this far, you can
easily conclude that the sum of the pth powers, for
all p odd (not just p = 3), is a polynomial in that
first little expression, n(n + 1)/2. And for all p even
(not just p = 2) it is equal to 2n + 1 times a polynomial
in that first little expression.
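As a hedged illustration of this last claim, the sketch below checks it for p = 2, 3, 4 and 5, writing T = n(n + 1)/2. The closed forms for p = 4 and p = 5 are standard results that the article does not quote, and the function names are illustrative.

```python
def power_sum(n, p):
    """Sum of the p-th powers of the first n positive integers, by brute force."""
    return sum(k**p for k in range(1, n + 1))


def faulhaber_checks(limit=100):
    """Verify polynomial-in-T forms of the power sums for n = 0..limit."""
    for n in range(limit + 1):
        t = n * (n + 1) // 2
        # Odd p: a polynomial in T alone.
        assert power_sum(n, 3) == t**2
        assert 3 * power_sum(n, 5) == 4 * t**3 - t**2
        # Even p: (2n + 1) times a polynomial in T.
        assert 3 * power_sum(n, 2) == (2 * n + 1) * t
        assert 15 * power_sum(n, 4) == (2 * n + 1) * t * (6 * t - 1)
    return True


print(faulhaber_checks())   # True
```

The identities are multiplied through by their denominators so that the check stays in exact integer arithmetic.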

Rigorization 

But my title promised you some methodology.
Let's get to that. We have our results. But where
are the proofs? Where are the axioms, that are
supposed to be the first line of the proof? We
never wrote a single axiom! Does that mean we
never proved anything? And if we never proved
anything, does that mean we don't know anything,
that all this is just a waste of time, not mathematics
at all?

Or does it mean that there is something misguided, 
or wrong-headed, in the notion that mathematics 
is all about logical deductions from axioms? 

When I wrote up this bit of elementary algebra 
for publication, I included two little algebraic 
identities, stating the oddness and evenness of 
the sums of powers, a fact which can be read off 
directly from the enlarged tables. I asserted that 
these two identities can easily be proved by induction.
That task was left for the skeptical reader to
confirm, and for the less skeptical to accept. Other
than that, the claims I made here were all stated
without proof. The proofs are too simple and
elementary to justify taking up valuable journal
space.

Does that mean that the whole article was too
simple and elementary to get published? No, not
at all! It contains a new idea, a new approach, to a
very old topic. The fact that the new idea is as simple
as possible makes it more interesting, not less.

The idea, the device, of extending the sum functions
to negative n, is what's interesting. The
proofs of the consequences of this simple idea are
obvious, mere elementary exercises. Not worth
publishing.

So, OK, the proofs aren't published, but they
are easily supplied. But how can there be proofs
without axioms? How do you even start a proof,
without having the starting point, the axioms or
assumptions or hypotheses? In fact, it is not hard
to figure out that the starting point here, and in
fact in nearly all math proofs, is simply all that
we need to know and do know from established
mathematics. We are entitled to use all of that,
without apology. That is how mathematics is done.
The mathematician asks him/herself, "What
do I know that would help solve this problem?"
Meaning, mostly, what do I know from established
math, that every mathematician knows,
or could know by looking it up. If what I already
know is inadequate, what does my office mate
know? What could I find in the right book, article,
or on-line source? I solve the problem by
thinking about the things involved in the problem
- mathematical entities - as I have them stored
or represented in my mind/brain, and then also
"in the literature". You might say, if you wanted to
stretch it, that in all mathematical work, there is
already stated, as a hypothesis, all of established
mathematics.

The End Result 

The end result of a successful mathematical investigation
is to add something to, or to make some
improvement in, the body of established mathematics.
To achieve this, the mathematician plays
around with his/her concepts, his/her mental models,
turning them upside down and inside out,
trying to get to where he/she wants to go. And
he/she calls in, as needed, anything and everything
already established in mathematics. Guessing
by analogy, piling up examples, drawing pictures
and erasing them and making new pictures,
till something clicks, something makes sense,
some new understanding is achieved.

Then, as a final chore, arranging it logically so that
it will convince other interested mathematicians.
The proof need only dwell on anything unfamiliar,
anything non-routine. What's interesting is the
new idea. If there is no new idea, there is probably
no reason for anybody to read about the work. The
proof is only what's needed to explain and convince
the appropriate reader - the mathematician
or maths student who has the appropriate background
and preparation. So doing math requires,
of course, the socialization, the indoctrination if
you will, to enable the student to know what is required
to make his/her new result convincing to
the intended audience of qualified readers. Just
as in any conversation, you don't bore people with
a lot of old trivia that they already know all about.

The Fundamental Theorem of Algebra states
that any nonconstant polynomial $p(z)$ over
$\mathbb{C}$ has at least one zero; that is, $p(z_0) = 0$ for
some $z_0 \in \mathbb{C}$. This easily implies that if the degree
of $p$ is $n$ then there are $n$ zeros, provided multiple
zeros are counted correctly. Indeed, $z - z_0$ divides
$p(z)$, and the quotient has degree $n - 1$, so we can
proceed by induction.

This result was widely used by Euler, Lagrange, 
and others, who offered various handwaving 
'proofs'. The first rigorous proof was given by
Gauss in his doctoral thesis. It involved the ma- 
nipulation of complicated trigonometric series to 
derive a contradiction, and was far from transpar- 
ent. The underlying idea can be reformulated in 
topological terms, involving the winding number 
of a curve about a point, see [1]. 

Modern Proofs 

Other classical proofs use deep results in complex
analysis, such as Liouville's Theorem: a bounded
function which is analytic on the whole of
the complex plane is constant. This depends on
Cauchy's Integral Formula and takes most of a
course in complex analysis to prove. See [3], 2.51.
An alternative approach uses Rouché's Theorem,
see e.g. [3], 3.44. Another proof - the first one I
was shown as a student - uses the Maximum Modulus
Theorem: if an analytic function is not constant,
then the maximum value of its modulus on
an arbitrary set occurs on the boundary of that set.



Carl Friedrich Gauss (1777-1855), a German mathematician
and scientist. Sometimes referred to as Princeps
mathematicorum (Latin, "the Prince of Mathematicians"), he is
often considered one of history's greatest mathematicians.

"ΕΥΡΗΚΑ! num = Δ + Δ + Δ"

(A famous note written in Gauss' diary after he proved that
every positive integer could be expressed as the sum of three
triangular numbers)
A variant uses the Minimum Modulus Theorem
(the minimum value of its modulus on an arbitrary
set is either zero or occurs on the boundary
of that set). See [2], Theorems 10.14, 10.15. Euler's
approach, which sets the real and imaginary
parts of $p(z)$ to zero and proves that the resulting
curves in the plane must intersect, can be made
rigorous. Clifford gave a proof based on induction
on the power of 2 that divides the degree $n$, which
is most easily explained using Galois theory.

An Elementary Proof 

All of these proofs are quite sophisticated. But 
there's an easier way. Some years ago I found a
simple proof using a few ideas from elementary 
point-set topology and estimates of the kind we 
encounter early on in any course on real analysis. 
I quickly discovered that it was already known to 
experts: you can find it on Wikipedia, for example. 
But it deserves to be more widely known, because 
it is simple and cuts straight to the heart of the 
issue. The necessary facts can be proved directly 
by elementary means, and would have been con- 
sidered obvious before mathematicians started 
worrying about rigour in analysis, in around 1850. 

The idea can be summarised in a few lines. Assume
for a contradiction that $p(z)$ is never zero.
Then $|p(z)|^2$ has a nonzero minimum value and
attains that minimum at some point $w \in \mathbb{C}$. Consider
points $v$ on a small circle centred at $w$, and
show that $|p(v)|^2$ must be less than $|p(w)|^2$ for
some $v$. Contradiction.


Here are the details: 

Theorem 1 If $p(z)$ is a non-constant polynomial
over $\mathbb{C}$, then there exists $z_0 \in \mathbb{C}$ such that $p(z_0) = 0$.

Proof Suppose for a contradiction that no such $z_0$
exists. For some $R > 0$ the set $D = \{z : |p(z)|^2 \le R\}$
is non-empty. The map $\psi : \mathbb{C} \to \mathbb{R}^+$ defined by
$\psi(z) = |p(z)|^2$ is continuous, so $D = \psi^{-1}([0, R])$ is
compact. For a subset of $\mathbb{C}$ this is equivalent to
being closed and bounded: $D$ is closed because $\psi$ is
continuous, and bounded because $|p(z)| \to \infty$ as
$|z| \to \infty$. It follows that $|p(z)|^2$
attains its minimum value on $D$. By the definition
of $D$ this is also its minimum value on $\mathbb{C}$.

Assume this minimum is attained at $w \in \mathbb{C}$. Then

\[ |p(z)|^2 \ge |p(w)|^2 \quad \forall z \in \mathbb{C}, \]

and by assumption $p(w) \ne 0$.

We now consider $|p(z)|^2$ as $z$ runs round a small
circle centred at $w$, and derive a contradiction.

Let $h \in \mathbb{C}$. Expand $p(w + h)$ in powers of $h$ to get

\[ p(w + h) = p_0 + p_1 h + p_2 h^2 + \cdots + p_n h^n \qquad (1) \]

where $n$ is the degree of $p$. Here the $p_j$ are specific
complex numbers. They are in fact the Taylor
series coefficients $p_j = p^{(j)}(w)/j!$, but we don't
actually need to use this, and (1) can be proved
algebraically without difficulty.

Clearly $p_0 = p(w)$, and we are assuming this is
nonzero, so $p_0 \ne 0$. If $p_1 = p_2 = \cdots = p_n = 0$ then
$p(z) = p_0$ is constant, contrary to hypothesis. So
some $p_j \ne 0$. Let $m$ be the smallest integer $\ge 1$ for
which $p_m \ne 0$. In (1), let $h = \varepsilon e^{i\theta}$ for small $\varepsilon > 0$. Then

\[ p(w + \varepsilon e^{i\theta}) = p_0 + p_m \varepsilon^m e^{mi\theta} + O(\varepsilon^{m+1}). \]

Therefore

\begin{align*}
|p(w + \varepsilon e^{i\theta})|^2
  &= |p_0 + p_m \varepsilon^m e^{mi\theta}|^2 + O(\varepsilon^{m+1}) \\
  &= p_0 \bar{p}_0 + \bar{p}_0 p_m \varepsilon^m e^{mi\theta}
     + p_0 \bar{p}_m \varepsilon^m e^{-mi\theta} + O(\varepsilon^{m+1}).
\end{align*}

Let $\bar{p}_0 p_m = r e^{i\phi}$ with $r \ge 0$. Since $p_0 \ne 0$ and
$p_m \ne 0$ we have $r > 0$. Setting $h = 0$ we see that
$p_0 \bar{p}_0 = |p(w)|^2$. Now

\begin{align*}
|p(w + \varepsilon e^{i\theta})|^2
  &= p_0 \bar{p}_0 + r e^{i\phi} \varepsilon^m e^{mi\theta}
     + r e^{-i\phi} \varepsilon^m e^{-mi\theta} + O(\varepsilon^{m+1}) \\
  &= |p(w)|^2 + 2 \varepsilon^m r \cos(m\theta + \phi) + O(\varepsilon^{m+1}).
\end{align*}

Set $\theta = \frac{1}{m}(\pi - \phi)$, so that $\phi = \pi - m\theta$. Then
$\cos(m\theta + \phi) = \cos(\pi) = -1$, and

\[ |p(w + \varepsilon e^{i\theta})|^2 = |p(w)|^2 - 2 \varepsilon^m r + O(\varepsilon^{m+1}). \]

But $\varepsilon, r > 0$, so for sufficiently small $\varepsilon$ we have

\[ |p(w + \varepsilon e^{i\theta})|^2 < |p(w)|^2, \]

contradicting the definition of $w$. Therefore there
exists $z_0 \in \mathbb{C}$ such that $p(z_0) = 0$.
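The key step of the proof is constructive enough to try numerically. The sketch below is an illustration under stated assumptions, not part of the proof: given a polynomial and a point w that is not a zero, it computes the coefficients of expansion (1), picks the angle with cos(mθ + φ) = -1, and confirms that a small step in that direction lowers |p|². The function names and the tolerance `tol` are illustrative choices.

```python
import cmath
import math


def evaluate(coeffs, z):
    """Horner evaluation; coeffs lists p(z) from the highest power down."""
    value = 0
    for c in coeffs:
        value = value * z + c
    return value


def taylor_coefficients(coeffs, w):
    """Coefficients p_0, ..., p_n of p(w + h) in powers of h, as in (1).

    Uses repeated synthetic division by (z - w): each remainder is the
    next coefficient, and the quotient is divided again.
    """
    work = list(coeffs)
    shifted = []
    while work:
        for i in range(1, len(work)):
            work[i] += work[i - 1] * w
        shifted.append(work.pop())
    return shifted


def descent_direction(coeffs, w, tol=1e-12):
    """Return (m, r, theta) with theta = (pi - phi) / m, as in the proof."""
    p = taylor_coefficients(coeffs, w)
    m = next(j for j in range(1, len(p)) if abs(p[j]) > tol)
    r, phi = cmath.polar(p[0].conjugate() * p[m])
    return m, r, (math.pi - phi) / m


# p(z) = z^2 + 1 at w = 0, where p_1 = 0 and so m = 2.
coeffs, w, eps = [1, 0, 1], 0, 0.01
m, r, theta = descent_direction(coeffs, w)
v = w + eps * cmath.exp(1j * theta)
print(m, abs(evaluate(coeffs, v)) ** 2 < abs(evaluate(coeffs, w)) ** 2)  # 2 True
```

Repeating such steps is not claimed to be an efficient root-finder; the point is only that away from a zero there is always a downhill direction for |p|².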

The same idea can be adapted to give an equally
simple proof of Liouville's Theorem:

Theorem 2 If $f(z)$ is analytic on the entire complex
plane, and is not constant, then $f(z_0) = 0$ for
some $z_0 \in \mathbb{C}$.

The only new feature in the proof is that the polynomial
in (1) becomes a power series, and now
we really do need Taylor's theorem.

References

[1] Ian Stewart; 1977; Gauss; Scientific American
237, 122-131.

[2] Ian Stewart and David O. Tall; 1983; Complex
Analysis; Cambridge University Press; Cambridge.

[3] Edward C. Titchmarsh; 1939; The Theory of
Functions; Clarendon Press; Oxford.

About the Author 

Ian Stewart FRS is a professor of mathematics at 
the University of Warwick, and a widely known 
author of science fiction and popular science
books. His research interests include dynamical 
systems, bifurcation theory and pattern forma- 
tion. He lists his recreational interests as painting, 
guitar, Egyptology and snorkelling. 















Pisces, by Mike Naylor 

This image is a visualisation of the sequence of bi- 
nary numbers from 0000 to 1111. The digits deter- 
mine the colour of each section, forming a pattern 
of seaweed and fish! 
One of the more striking predictions of
general relativity is the formation of black
holes via the gravitational collapse of sufficiently
concentrated mass and/or energy. This
prediction is also one of the most famous, in part
due to popular science books like Stephen Hawking's
A Brief History of Time bringing the concept
into mainstream culture. Black holes have even
made it to Hollywood - for example, in the 2009
film Star Trek, the Romulan villain uses a black
hole to consume the Vulcan homeworld. More
recently, a "singularity" was created to stop the armies
of General Zod in 2013's Man of Steel.

The appearance of black holes in major blockbuster
films means that special effects teams have had
to take a stab at depicting what one should look
like. Disappointingly, the best Hollywood could
come up with is not only wrong but downright
unimaginative. Star Trek's planet Vulcan turned
black hole, arguably the worst, was nothing more
than a circle-shaped black silhouette. For the same
companies that succeeded in creating such convincing
CG fire and water in movies like Quantum
of Solace and Ice Age 2 to fail completely when
it comes to black holes suggests that there might
be something intrinsically harder about the latter.
However, this is not the case. In this article I hope to
convince the reader that realistic computer renditions
of black holes are not only straightforward
to generate, but are also far more compelling and
beautiful than anything produced by Hollywood.

The Theory 

A natural place to begin is to first address the
question of why a black hole is even visible at
all. Strictly speaking, the answer here is the same
as for regular matter like rocks and trees - black
holes are visible because of the effect that they
have on passing light. However, whereas the former
affect light primarily through reflection and
absorption - that is through the electromagnetic
force and quantum effects acting on a very small
scale - in the case of a black hole it is the long
range action of gravity that is important.

General relativity predicts that light rays travel 
along geodesics in curved spacetime, meaning 
that light is both bent and focused by a massive 
body - an effect sometimes called gravitational 
lensing. This means that the presence of a mas- 
sive body in (for example) the night sky will affect 
both the apparent position and brightness of the 


stars in its vicinity. This effect was first confirmed
by Sir Arthur Eddington in 1919 when he meas- 
ured the (slight) deflection of starlight by the sun 
during a solar eclipse. 

Unlike the sun, which distorts its environment 
slightly but is visible mainly for other reasons, a 
black hole appears as a pure distortion in whatever
happens to be behind it. Therefore, a black
hole's environment is important to its appearance.
Although space might seem to be the most natural
setting, in this article we instead go with a theme
of famous places - rendering locations like the
Great Wall of China and the Eiffel Tower as they
would appear with a black hole in the vicinity. For
the sake of simplicity we only consider distortion
due to the deflection of light, neglecting changes
in brightness due to focusing, and also colour
due to gravitational redshift. For this, it suffices
to compute (numerically) the set of light-like geodesics
(i.e. light ray paths) starting from a fixed
point in spacetime and expanding outwards in all 
directions. These rays are lines of sight connecting 
the observer and observed, determining what the 
human eye (or a camera) sees when looking into 
a black hole. Modifying the celebrated ray tracing 
algorithm from computer graphics to make use of 
these geodesics allows us to simulate gravitational 
lensing on the computer. 

First we need to settle on a suitable spacetime. 
The most obvious choice is probably the famous 
Schwarzschild solution - discovered in 1916 only 
about a month after Einstein published his gen- 
eral theory of relativity - describing a single non- 
rotating, uncharged black hole which has existed 
since the beginning of time. However, in this ar- 
ticle we will be working with a slightly more ex- 
otic creature - charged black holes. The addition 
of electric charge has (in the author's experience)
no effect on the qualitative appearance of a black 
hole. At the same time, it comes with the impor- 
tant advantage of allowing us to visualize space- 
times containing more than one of them. 

In general, multi black hole spacetimes are highly 
complex, dynamic objects. The nonlinear inter- 
action of the black holes means that the metric 
cannot be written in closed form, and one must 
resort to the numerical solution of the Einstein 
equations. Numerically solving for geodesics on 
a manifold that is itself constructed numerically 
is a challenging task, and beyond the scope of this 
article. However, multiple charged black holes
provide a way out - when the amount of charge is
just enough to exactly cancel their mutual gravitational
attraction, the resulting spacetime is static
and can be written in closed form. Such spacetimes
belong to the Majumdar-Papapetrou family
of solutions discovered in 1947 (see [2]), from
which all pictures in this article have been created.

Figures 1(a) and 1(b)

Figure 1(a) shows an example where a set of test
rays are fired backwards in time from an observer
situated in a Majumdar-Papapetrou spacetime
containing a single black hole at the origin. The
paths of the rays are calculated by numerically
integrating the geodesic equations. Notice
that they appear to be divided into two categories
- those that are deflected as they pass the
black hole but ultimately escape to infinity, and
those that are "captured" and fall into the origin.
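To make the escaped/captured distinction concrete, here is a rough planar sketch of such a ray integration. It is not the author's code, and it swaps in a reformulation the article does not mention: it assumes the standard Majumdar-Papapetrou metric ds² = -U⁻² dt² + U² (dx² + dy² + dz²) with U = 1 + Σ Mᵢ/|x - xᵢ| (units G = c = 1), for which light paths coincide with rays in a medium of refractive index n = U² (Fermat's principle). All function names, the step rule and the cut-off radii are illustrative assumptions.

```python
import math


def potential(x, y, holes):
    """Return U = 1 + sum(M / r) and its gradient; holes = [(M, x_h, y_h), ...]."""
    u, ux, uy = 1.0, 0.0, 0.0
    for mass, hx, hy in holes:
        dx, dy = x - hx, y - hy
        r = math.hypot(dx, dy)
        u += mass / r
        ux -= mass * dx / r**3
        uy -= mass * dy / r**3
    return u, ux, uy


def derivative(state, holes):
    """Ray equations dx/dtau = p, dp/dtau = grad(n^2 / 2) with n = U^2."""
    x, y, px, py = state
    u, ux, uy = potential(x, y, holes)
    k = 2.0 * u**3                      # grad(U^4 / 2) = 2 U^3 grad(U)
    return (px, py, k * ux, k * uy)


def rk4_step(state, dtau, holes):
    """One classical fourth-order Runge-Kutta step."""
    def nudge(k, f):
        return tuple(s + f * d for s, d in zip(state, k))
    k1 = derivative(state, holes)
    k2 = derivative(nudge(k1, dtau / 2), holes)
    k3 = derivative(nudge(k2, dtau / 2), holes)
    k4 = derivative(nudge(k3, dtau), holes)
    return tuple(s + dtau / 6 * (a + 2 * b + 2 * c + d)
                 for s, a, b, c, d in zip(state, k1, k2, k3, k4))


def trace_ray(x, y, angle, holes, r_capture=0.05, r_escape=2000.0, max_steps=100000):
    """Follow one ray; return ("escaped", final direction) or ("captured", None)."""
    u, _, _ = potential(x, y, holes)
    state = (x, y, u * u * math.cos(angle), u * u * math.sin(angle))
    for _step in range(max_steps):
        x, y = state[0], state[1]
        nearest = min(math.hypot(x - hx, y - hy) for _m, hx, hy in holes)
        if nearest < r_capture:
            return "captured", None
        if math.hypot(x, y) > r_escape:
            return "escaped", math.atan2(state[3], state[2])
        u, _, _ = potential(x, y, holes)
        # Each step covers about 2% of the distance to the nearest hole.
        state = rk4_step(state, 0.02 * nearest / (u * u), holes)
    return "undecided", None


holes = [(1.0, 0.0, 0.0)]                       # one hole of mass 1 at the origin
print(trace_ray(-1500.0, 100.0, 0.0, holes))    # passes at a distance, is deflected
print(trace_ray(-1500.0, 0.0, 0.0, holes))      # aimed straight at the hole
```

A ray passing at a large distance b should be bent by roughly 4M/b, the usual weak-field value, which gives a simple sanity check. Rendering an image would repeat this for one ray per pixel, in three dimensions, and look up the background colour from the escape direction.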

Here appearances are a little misleading - the origin
is actually the black hole's event horizon in our
current choice of coordinates, and time (as measured
by a distant, stationary observer) is diverging
to minus infinity as the rays approach. Thus the
apparently captured rays never actually make it
inside the black hole (it's a good thing too because
if they did, then their time reversals would be rays
climbing out from inside the event horizon, contradicting
the definition of the latter). Nevertheless
for convenience we continue to refer to these
rays as captured.

Bearing this in mind, let us try to imagine what
our (single) black hole might look like. For the
purposes of discussion it is conceptually easier if
we assume all light sources are far away and point-like,
and hence for the time being we assume a
backdrop of stars. However, this makes for less interesting
pictures so we will drop this assumption
when it comes time to show some results.

Figures 2(a) and 2(b)

Figure 3 (Thanks to Alexandre Duret-Lutz for the original picture)

The apparent position of a star is determined by
the tangent vector to the geodesic(s) joining the
eye/camera with it - see Figure 1(b). As is often
the case for boundary value problems, this geodesic
is not unique - hence each star has multiple
images. In fact, it turns out that a light ray may
circle the black hole any number of times, either
clockwise or counterclockwise, en route from a
distant star to the observer. Hence each (physical)
star maps to infinitely many star images.

The collection of captured rays corresponds to a
set of directions from which no light can reach
the observer. This is called the black hole shadow,
and appears as a great dark void - circle-shaped,
in the present case. Surrounding the shadow is
a halo of stars - a result of the accumulation of
infinitely many star images into a finite region of
image space.

Examples 

When the background consists not of point light
sources but rather extended bodies, the situation
is more complicated, but not a lot more. Figure
2(b) shows a first example with a single black hole
in front of the Great Wall of China (for reference,
an undistorted photo is provided in Figure 2(a)).
As in the discussion, we see a dark shadow region
surrounded by duplicates of the objects in the
scene. However, since these objects now occupy a
finite region in image space, it is possible for their
various copies to partially merge, resulting in a
kind of Siamese twin effect. This is illustrated in
Figure 3, where two black holes hover in front of
the Eiffel Tower.

It is worth mentioning a kind of gravitational
fractal effect that only occurs when the number of
black holes is two or more. In this case, the black
holes have a lensing effect on each other, meaning
that around each black hole there are multiple
apparent copies of the other black holes. But
then these copies are in turn surrounded by more
apparent copies, and so on, forever. This is illustrated
(to a recursion depth of one) in Figure 4,
using St John's College, Cambridge as a backdrop.

Wormholes 

Either a tunnel connecting two otherwise separate 
universes or else two locations within the same 
universe, the concept of a wormhole is almost as 
famous as that of a black hole. However, whereas 
most physicists and astronomers today accept 
black holes as real astronomical objects, 
the existence of wormholes is far more suspect. 
While one can construct wormhole spacetimes 
which technically do solve the Einstein equa- 
tions, the caveat is that the local energy density 
(the technical term here is stress-energy tensor) 
has to take on values which many experts believe 
to be physically impossible (see [1] p. 151). This 
detail has not stopped a parade of science fiction 
films from featuring wormholes. It also doesn't 
stop us from computing what a wormhole would 
look like if it did exist (it goes without saying that 
the movies all got it wrong). An example is shown 
in Figure 5 of a wormhole connecting the Berlin 
Holocaust monument to a beach in the Caribbean. 

Figure 5 (Thanks to Alexandre Duret-Lutz for the beach 
picture and Marco Reinhardt for the monument picture) 

Things start to get interesting when the number of 
wormholes is greater than one. Suppose we take 
our spacetime with a single wormhole connect- 
ing two universes A and B, and add to it a second 
wormhole also linking A and B. An observer O in 
universe A can see universe B through wormhole 1. 
But since universe B contains wormhole 2 lead- 
ing back to universe A, O can also see a miniature 
copy of their own universe (including a copy of 
themselves) by looking first through wormhole 1 
and then wormhole 2. The result is an infinite 
cascade of wormholes within wormholes, similar to 
the effect of standing between two large mirrors 
on opposite sides of a hallway. 


One can take this game further and construct 
spacetimes with increasingly complex topologies 
by adding more wormholes and/or universes, resulting 
in fascinating pictures. To be appreciated 
properly these kinds of images need to be viewed at a 
higher resolution than is possible in a magazine 
- therefore I refer the interested reader to 
my website, www.maths.cam.ac.uk/postgrad/cca/people/lrh30.html. 
Sums 

Suppose that we partition the natural numbers 
into finitely many classes. Can we always find 
x and y such that all of x, y and x + y are in the 
same class? Equivalently, if we ‘finitely colour’ the 
naturals, can we find x, y, z of the same colour with 
x + y = z? This is a typical question in Ramsey theory, 
which seeks to answer questions of the form 
‘can we find some order in enough disorder?’ In 
this case, the disorder is the unknown colouring, 
and the little patch of order is the x, y, x + y. (It is 
worth mentioning in passing that the reader can 
deduce from the asking of the question that we 
are not considering 0 as a natural number. More 
interesting would be to ask ‘are we allowed x = y?’, 
but in fact it is quite easy to find colourings that 
ensure that we cannot have x = y, so in fact it 
makes no difference whether or not we insert the 
word ‘distinct’.) 

For example, if we colour every even number red 
and every odd number blue then we could take 
x = 4 and y = 6. If we colour every square red and 
every non-square blue we could take x = 5 and 
y = 6 (or indeed x = 9 and y = 16). In any particular 
example, it may be quite easy to find a suitable 
x and y, but is this always the case? It turns out 
that the answer is yes. This is called Schur’s theorem, 
dating from 1916. Schur’s theorem is not too 
hard to deduce from Ramsey’s theorem - indeed, 
it often appears on the examples sheet for the Part 
II Graph Theory course, in the section on Ramsey 
Theory. 
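
The smallest instance of Schur’s theorem can be checked by brute force. The following is a minimal Python sketch, not part of the original argument: the function names are illustrative, and x = y is allowed. Under those assumptions it confirms that every 2-colouring of 1, ..., 5 contains x, y, x + y of one colour, whereas 1, ..., 4 can still be coloured so as to avoid this.

```python
from itertools import product


def has_schur_triple(colouring):
    """colouring[i] is the colour of the number i + 1.

    Returns True if some x <= y with x + y in range has
    x, y and x + y all the same colour.
    """
    n = len(colouring)
    return any(
        colouring[x - 1] == colouring[y - 1] == colouring[x + y - 1]
        for x in range(1, n + 1)
        for y in range(x, n - x + 1)  # keeps x + y <= n
    )


def every_colouring_has_triple(n, colours):
    """Check all colourings of 1..n with the given number of colours."""
    return all(
        has_schur_triple(c) for c in product(range(colours), repeat=n)
    )


print(every_colouring_has_triple(5, 2))  # True
print(every_colouring_has_triple(4, 2))  # False
print(has_schur_triple((0, 1, 1, 0)))    # False: classes {1, 4} and {2, 3}
```

The search is exponential in n, so it is only a sanity check for tiny cases and says nothing about the general proof.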

What about sums of more terms? How about all 
of the seven non-zero sums from x, y, z? And in 
general, how about FS(x_1, ..., x_n), meaning the 
set of all sums Σ_{i∈I} x_i where I is a non-empty subset 
of the index set {1, ..., n}? This is much harder: it is not 
on any Part II sheet that I am aware of. It is true, 
though, that (for any n) whenever the naturals are 
finitely coloured there exist x_1, ..., x_n such that 
FS(x_1, ..., x_n) is all one colour. This is called the 
Finite Sums theorem, and it is a special case of a 
famous theorem of Rado from 1933. 



Frank Ramsey (1903-1930) was a brilliant mathematician, 
philosopher and economist. Ramsey's Theorem was in fact 
proved by Ramsey as a minor lemma on the way to a result 
in logic. He suffered chronic liver problems, dying from 
jaundice at the age of just 26. (Photo courtesy of Stephen 
Burch, grandson of Frank Ramsey.) 
Products 

What about products? Can we always find, in a 
finite colouring of the naturals, three numbers 
x, y, z of the same colour with xy = z (apart from 
x = y = z = 1, of course)? Actually, it turns out that 
this is a silly question: it follows immediately from 
Schur’s theorem, just by restricting our attention 
to the powers of 2. Indeed, given a finite colouring 
of the naturals, consider a ‘new’ colouring where 
the new colour of x is the original colour of 2^x. For 
this new colouring, Schur’s theorem tells us that 
we can find x, y, x + y of the same colour, and this 
translates to 2^x, 2^y, 2^(x+y) having the same colour in 
the original colouring, as required. 

And similarly, of course, for products of more 
terms. We can always find a set of the form 
FP(x_1, ..., x_n) (meaning all non-empty products) 
that is just one colour, by exactly the same argument. 
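
The powers-of-2 argument is easy to act out on a computer. The sketch below is illustrative only: the helper names and the search bound are assumptions, and a bounded search stands in for Schur’s theorem. It recolours each exponent e with the original colour of 2^e, looks for x, y, x + y of one colour among the exponents, and translates the answer back.

```python
def schur_triple(colour, limit):
    """First (x, y, x + y) of one colour with x <= y and x + y <= limit."""
    for x in range(1, limit + 1):
        for y in range(x, limit - x + 1):
            if colour(x) == colour(y) == colour(x + y):
                return x, y, x + y
    return None


def product_triple(colour, limit):
    """Find a, b, ab of one colour among the powers of 2.

    The 'new' colour of an exponent e is the original colour of 2**e.
    """
    def pulled_back(e):
        return colour(2 ** e)

    found = schur_triple(pulled_back, limit)
    if found is None:
        return None  # bound too small; Schur's theorem says a larger one works
    x, y, z = found
    return 2 ** x, 2 ** y, 2 ** z


def colour_mod_3(v):
    return v % 3  # an example colouring with three colours


a, b, c = product_triple(colour_mod_3, 10)
print(a, b, c, a * b == c)
```

Any other colouring can be passed in place of the hypothetical colour_mod_3; only the bound may need to grow.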

Sums and Products 

So far so good. We have dealt with sums, and with 
products. How about combining them? So is it 
true that whenever the naturals are finitely coloured 
we can always find x and y such that the 
four numbers x, y, x + y, xy are the same colour? 
More generally, of course, we would like to find 
(for any given n) numbers x_1, ..., x_n such that 
FS(x_1, ..., x_n) ∪ FP(x_1, ..., x_n) (in other words: 
all sums and all products) is one colour. Actually, 
even more generally one would like to be able to 
iterate the summing and the producting (without 
repeating a term): so for example for three 
terms one would really like to have x, y, z, x + y, 
x + z, y + z, xy, xz, yz, x + y + z, xyz and also 
xy + z, xz + y, yz + x and x(y + z), y(x + z), z(x + y). 

The answer is: nobody knows. This is an open 
problem. It has been thought about a lot, but with 
no success. Remarkably, leaving aside all this 
‘what we would really want’ stuff, it is even an open 
problem for the case of just two terms! In other 
words, it is unknown whether or not whenever the 
naturals are finitely coloured there exist x and y 
such that all of x,y,x + y, xy are the same colour. 
It seems utterly scandalous that this very small 
starter case should remain unsolved. Actually no- 
one even has any ideas or hunches. No-one has 
ever put forward a proof idea or proof scheme 
that even looks for a moment like it might have 
any chance of working. In the other direction, no- 
one has ever come up with a colouring that might, 
even with 5 seconds of thought, have the potential 


to be a counterexample. To put it another way, for 
any actual colouring you are told it is invariably 
very easy indeed to find such x and y. 
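
To see how easy it is for familiar colourings, here is a small Python search, offered as a sketch rather than anything from the original article. The function name, the bound and the three example colourings are illustrative choices, and x and y are taken to be distinct.

```python
from math import isqrt


def find_sum_product_pair(colour, limit):
    """First x < y <= limit with x, y, x + y and xy all the same colour."""
    for x in range(1, limit + 1):
        for y in range(x + 1, limit + 1):
            if colour(x) == colour(y) == colour(x + y) == colour(x * y):
                return x, y
    return None


def parity(v):
    return v % 2


def mod_three(v):
    return v % 3


def is_square(v):
    return isqrt(v) ** 2 == v


print(find_sum_product_pair(parity, 50))     # (2, 4): 2, 4, 6, 8 are all even
print(find_sum_product_pair(mod_three, 50))  # (3, 6): 3, 6, 9, 18
print(find_sum_product_pair(is_square, 50))  # (2, 3): 2, 3, 5, 6 are non-squares
```

Of course a search like this can only ever confirm individual colourings; it gives no handle on the open problem itself.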

Part of the reason for the lack of progress on 
a proof might be what one could call a ‘lack of 
portability’. If we prove some statement about 
sums (like for example that if we colour the num- 
bers from 1 to 100 with 2 colours then we can 
find a solution to x + y = z in one colour class), 
then we can repeat it: if we look at the even num- 
bers from 2 to 200 we will find the same object 
(in pretentious language, there are plenty of ho- 
momorphisms from the group Z to itself). This 
‘repeating’ feature turns out to be of great impor- 
tance in most proofs of results about sums. And 
the same for products: if we know something 
about products inside the numbers from 1 to 100 
then we know the same statement about the set of 
square numbers from 1^2 to 100^2 (again, there is a 
pretentious way to say this). But, sadly, there is no 
way to transfer results about sums-and-products 
around the place - there are not too many ring 
homomorphisms from Z to itself! 

It is also possible to give a reason why bad colour- 
ings are so hard to think of. To find a colouring 
without these sum-and-product structures, one 
would need a colouring that ‘meshes well’ with 
addition and multiplication. Now, there are lots 
of colourings that mesh well with addition (for 
example, colouring by value modulo something, 
or colouring by least significant nonzero digit in 
base something), and similarly for multiplication 
- but nobody has come up with a colouring that 
meshes with both. 

I have left the most embarrassing ‘lack-of-knowledge’ 
until last. We have said that, even in the simplest 
possible case of ‘two terms’, i.e. x, y, x + y, xy, 
the answer is not known. But what if we go even 
further, and ignore the colour of x and y them- 
selves? So the question becomes: is it true that, 
whenever the naturals are finitely coloured, there 
exist x and y such that the two numbers x + y and 
xy have the same colour as each other? (We ex- 
clude the case of x = y = 2 in this, or equivalently 
we insist that x and y are distinct.) 

Even this super-special small case is unknown. It 
can be checked on computer for a small number 
of colours (I think it is known for up to 5 colours), 
but that is not much evidence in its favour. This is 
one of the most maddening, and most tantalising, 
problems in the whole of Ramsey theory! 
The Axiom of Choice 

Robin Elliott 
The axiom of choice (AC) states that for any 
collection X of non-empty sets, there exists 
a function f from X to the union of all 
sets in X such that f(A) ∈ A for all A ∈ X. The slogan 
is “you can choose an element of each set in any 
collection of non-empty sets”. This may seem like 
an axiom that states the bleeding obvious, but this 
is only because any intuition we may have of a 
“choice function” will appeal to finite cases, or sets 
with “small” infinite cardinality, such as the naturals 
or reals. The power of AC is that it applies to 
an arbitrarily large collection of non-empty sets. 
In the words of Douglas Adams, “Sets can be big. 
Really big”. Given an arbitrarily large collection of 
non-empty sets U_i, each of which can itself 
be arbitrarily large, we have no hope of explicitly 
constructing a function that chooses an element 
in each of the U_i, especially if we have little to 
no information about each of the U_i. Indeed, in 
general we can’t explicitly construct such a choice 
function, since AC is independent of the rest of 
standard (ZF) set theory. AC, therefore, steps in 
to guarantee a non-constructive existence of such 
a choice function. 

The Prisoners and Hat Puzzle 

Suppose n prisoners are positioned on the inte- 
gers 1, 2,..., n on the number line, and all are fac- 
ing in a positive direction: prisoners can see other 
prisoners standing on larger integers than they are 
at, but not lower integers. An executioner places a 
black or white hat on each of the prisoners’ heads 


and stipulates that, in some order, each prisoner 
must guess the colour of his hat. If he guesses 
correctly he lives and if he guesses incorrectly he 
dies. The prisoners are not allowed to communi- 
cate except for exclaiming their guesses of either 
black or white, although they are allowed to agree 
on a strategy prior to being lined up. What strat- 
egy should the prisoners employ to minimise the 
number of deaths, and how many prisoners are 
going to die? 

At first thought, it seems like prisoners may as 
well guess randomly, since although they can see 
some other prisoners’ hats, this gives them no in- 
formation about the colour of their own hat. This 
leads to half of the prisoners dying, and half going 
free. You may be surprised, therefore, to learn that 
there is a strategy in which all but one prisoner is 
assured freedom. 

The strategy a mathematically-minded set of pris- 
oners would devise is as follows. Each prisoner 
counts the number of black hats he can see in 
front of him, and remembers the parity of this 
count. Then the prisoner standing at 1 exclaims 
“black” if he sees an odd number of black hats, and 
“white” if he sees an even number of black hats. 
Although he may die as a result of this, the pris- 
oner at 2 now knows the colour of his own hat: if 
1 called “black” and 2 sees an odd number of black 
hats, he knows he must have a white hat. Similarly 
if 1 called “white” and 2 sees an even number of 
black hats, 2 knows he is wearing a white hat. For 
the other two cases, 2 knows he is wearing a black 
hat. 

But then the prisoner standing at 3 can also de- 
duce the colour of his hat in a similar manner: he 
knows the colour of the hats of everyone 1 could 
see apart from his own, and from this he can de- 
duce the colour of his own hat. Continuing in- 
ductively, all subsequent prisoners can correctly 
identify the colour of their own hats. 
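
The parity strategy can be simulated directly. The following Python sketch is an illustration under stated assumptions: hats are encoded as True for black, the prisoners guess in the order 1, 2, ..., n, and the function names are made up for the example. Each prisoner uses only what he has heard and what he can see.

```python
from itertools import product


def play(hats):
    """Return the list of guesses; True means "black".

    hats[0] belongs to the prisoner at 1, who can see hats[1:], and so on.
    """
    announced = sum(hats[1:]) % 2   # parity seen by prisoner 1
    guesses = [announced == 1]      # he calls "black" for odd, "white" for even
    heard = 0                       # "black" guesses heard from prisoners 2, 3, ...
    for i in range(1, len(hats)):
        seen = sum(hats[i + 1:])    # black hats this prisoner can still see
        own_is_black = (announced - heard - seen) % 2 == 1
        guesses.append(own_is_black)
        heard += own_is_black
    return guesses


def deaths(hats):
    """Number of prisoners whose guess differs from their actual hat."""
    return sum(g != h for g, h in zip(play(hats), hats))


# Try every possible arrangement of hats for a line of 6 prisoners.
worst = max(deaths(h) for h in product([False, True], repeat=6))
print(worst)  # 1: only the prisoner at 1 can ever be wrong
```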

More Prisoners, with Earmuffs 

Let’s have the same setup as before, except this 
time we have infinitely many prisoners standing 
on the naturals 1, 2, ... instead of finitely many 
prisoners. We further stipulate that each prisoner 
wears earmuffs! That is, each prisoner can now no 
longer hear anything any other prisoner says. The 
above strategy, cunning as it may have been, will 
clearly no longer work. What strategy should the 
prisoners employ now? 

More Prisoners, More Hats, 
with Earmuffs 

Let’s up the ante further. We have the same setup 
as before (i.e. infinitely many prisoners, who cannot 
hear each other) but instead of two possible 
colours for the hats, we have uncountably many 
colours. To give a concrete example, the execu- 
tioner could assign a real number to each prisoner, 
with the extra condition that the prisoner knows 
the real numbers assigned to all prisoners higher 
up than him on the number line. If the prisoners 
were to guess randomly, they would all die with 
probability 1: if I tell you I’m thinking of a real 
number (it follows a normal distribution, say), 
you have probability 0 of guessing it first time 
round. 

Remarkably, in this set-up - and indeed the 
previous one with infinitely many prisoners and two 
hat colours - we can ensure that all but finitely 
many prisoners live. That is, past some prisoner 
standing at N_0, we have that everyone remaining 
correctly guesses the real number that has been 
assigned to them. The strategy relies heavily on 
AC and is therefore, like all proofs relying on AC, 
non-constructive. 

Strategy with Choice 

We present here the strategy that works for an 
infinite number of earmuffed prisoners, and the 
‘colours’ of the hats members of any set X. Consider 
the set X^N of sequences in X, that is, sequences 
(x_1, x_2, ...) such that x_i ∈ X for all i. Define an 
equivalence relation ~ on X^N by A, B ∈ X^N having 
A ~ B if and only if A and B differ only in finitely 
many terms. That is, A ~ B if there exists some N_0 
such that, for all n > N_0, A_n = B_n. Checking ~ is 
an equivalence relation is not too strenuous, and 
once this is established we can consider the set of 
equivalence classes X^N/~. Each equivalence class 
consists of “sequences in X that eventually end up the 
same”, so for example if X = {black, white} then 
one equivalence class of X^N would be “all sequences 
which eventually end up all black”. Another would 
be “all sequences which eventually have black in 
even positions and white in odd positions”. 

Now, for each equivalence class C in X^N/~, pick 
an element in C. This seemingly simple statement 
is where we have invoked the axiom of choice: 
X^N/~ is our collection of non-empty sets, and by 
choosing an element in C for all C in X^N/~, we 
are assuming the existence of a choice function 
on the elements of X^N/~. So we have a set T such 
that for each sequence A in X^N, there exists an S ∈ T such 
that A ~ S. In more informal terms, we have a set 
T which contains examples of all possible ways a 
sequence in X can eventually end up. 

We are nearly there. The prisoners agree before- 
hand on the set T , which they memorise. Then 
the strategy is simple: each prisoner inspects 
all those in front of him, and so can tell which 
equivalence class in X^N/~ the hat sequence lies in. He recalls the 
element s of T corresponding to that equivalence 
class, and if the prisoner is at position n then he 
exclaims the nth element of s. 

Why does this work? All prisoners, no matter 
what position they are in, will recall the same ele- 
ment s of T given their viewpoint of the rest of 
the sequence. If the actual sequence of hat colours 
does not correspond to the sequence s, it must 
at least differ from s only in finitely many terms, 
since it is equivalent under the relation ~ to s. So 
there is a point beyond which the actual sequence of 
prisoners’ hats always agrees with s, and any 
prisoners past this point will correctly identify 
their own hat colour. 
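
The set T cannot be written down in general, so the strategy itself cannot be run. As a rough illustration only, the Python sketch below makes a strong simplifying assumption that is not in the original: the warden’s sequence is eventually constant, described by a finite prefix followed by a single tail colour. For such sequences a representative of each class can be named explicitly (the constant sequence), so no choice is needed, but the mechanism is the same: every prisoner sees the tail, recalls the agreed representative, and announces its entry at his own position. All names are hypothetical.

```python
def representative(tail_colour):
    """Agreed representative of the class of sequences ending in tail_colour."""
    return lambda n: tail_colour  # the constant sequence


def actual_hat(prefix, tail_colour, n):
    """Hat at position n (1-based) of the sequence prefix + constant tail."""
    return prefix[n - 1] if n <= len(prefix) else tail_colour


def guess(prefix, tail_colour, n):
    """Prisoner n sees every hat beyond n, hence the tail, hence the class."""
    return representative(tail_colour)(n)


def wrong_guessers(prefix, tail_colour, check_up_to):
    """Positions up to check_up_to at which the guess is wrong."""
    return [
        n for n in range(1, check_up_to + 1)
        if guess(prefix, tail_colour, n) != actual_hat(prefix, tail_colour, n)
    ]


print(wrong_guessers(["white", "black", "white"], "black", 50))  # [1, 3]
```

Only prisoners inside the finite prefix can be wrong, which mirrors the “all but finitely many” conclusion in this toy setting.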


Further Variations, Additional 
Remarks 

1. It is interesting to consider further the case 
without earmuffs, and also without using the axiom 
of choice. We may have a greater cardinality 
of prisoners (in some totally ordered set), or of hat 
colours. The prisoners can only transmit information 
from a set of the same cardinality as the set 
of hat colours (e.g. in the black/white case, no prisoner 
could say a real number; they would have 
to say black or white). For example, in the case 
with real-numbered hats between 0 and 1, the 
reader may like to find a convergent subsequence 
the first prisoner can sum to guarantee the lives 
of all of the other prisoners. Indeed, the reader is 
invited to think about which sets X have the same 
cardinality as X^N, and why for these X the prisoners 
can ensure that all but the first survive. 

2. If the warden knows the set T of representatives that 
the prisoners have chosen, he can ensure that any 
prisoner of his choosing will die, by choosing the 
element s ∈ T and then giving said prisoner a hat 
colour not equal to his corresponding hat colour 
in s. Moreover, if the warden knows the 
set T he can choose any finite set of prisoners to 
die. 

3. Editor’s Note: Extending this idea, the reader 
may like to consider what happens when the war- 
den chooses each hat colour according to some 
continuous distribution. What is the probability 
that a given prisoner dies? What is the probability 
that the first N all die? He/she may then recall the 
continuity property of probability. (The interested 
reader may wish to consult a text on measure the- 
ory, such as [3].) 

References and Further Reading 

[1] Prisoners and hats puzzle; Wikipedia; http://en.wikipedia.org/wiki/Prisoners_and_hats_puzzle. 

[2] Thomas J. Jech; 1973; The Axiom of Choice; Dover. 

[3] Marek Capinski, Ekkehard Kopp; 2004; Measure, Integral and Probability; Second Edition; Springer. 










Mandelbulb 

Mandelbulbs are the 3D analogue of 
Mandelbrot sets. There is no canonical 3D 
Mandelbrot set, since there is no 3D analogue 
of the 2D space of complex numbers. 
This mandelbulb is constructed using 
a ninth degree polynomial iteration, 
with powers defined in terms of spherical 
coordinates. 
As I was shopping online one day, an advertisement 
for Spot it!® caught my eye. This 
game is played with 55 circular cards. Each 
card has several images, and each pair of cards has 
exactly one common image (see Figure 1). Several 
games can be played with the deck, all involving 
multiple players trying to be the first to spot 
the matching image between two cards. With my 
curiosity piqued, I purchased the deck, which is 
made by Blue Orange Games. I quickly discovered 
that the deck is two cards shy of fully representing 
an order 7 finite projective plane. It seemed a 
natural course of action to create the two missing 
cards and then proceed to arrange the cards into a 
configuration which would make it easy to demonstrate 
the order 7 finite projective plane. I didn't 
realize how fun and challenging this would be. I'm 
hoping the rules (and solution) of this single-player 
challenge will be entertaining to mathematicians 
and game-lovers alike. For those who would 
like to play, but who don't have a Spot it!® deck, 
interactive games are available on my web page: 
http://www.donnadietz.com/Projective.html. 

Background 

Before discussing the higher order finite projective 
planes and affine planes, let us address the 
order 3 case (see Figure 2). In both the projective 
plane and the affine plane, points are connected 
by "lines", and any lines not sharing a point are 
parallel. In the affine plane, there are four sets 
of three parallel lines (which share a colour), 
each with three points. However, in the projective 
plane, there are four additional points and 



Figure 1 These two cards have an image in common 

(Thanks to Thierry Denoual, co-founder of Blue Orange 
Games, for permission to use this artwork.) 


no parallel lines. An affine plane of prime order 
n contains n^2 points and has n + 1 sets of n 
parallel lines, each with n points. The associated 
projective plane contains n^2 + n + 1 points and 
n^2 + n + 1 lines, each pair of points sharing a line 
and each pair of lines sharing a point. From the 
projective plane, any n + 1 (here, four) collinear 
points may be removed, along with the line through 
them, and an affine plane results. 
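
These counts can be checked by building the planes over the integers modulo a prime n. The Python sketch below uses one standard coordinate construction, which is an assumption on my part and not necessarily how any particular deck was designed; the labels such as ("inf", m) are purely illustrative.

```python
from itertools import combinations


def projective_plane(n):
    """Lines of a projective plane of prime order n, as frozensets of points."""
    lines = []
    for m in range(n):                       # one parallel class per slope m
        for b in range(n):
            points = {(x, (m * x + b) % n) for x in range(n)}
            points.add(("inf", m))           # extra point shared by the class
            lines.append(frozenset(points))
    for c in range(n):                       # the class of vertical lines
        points = {(c, y) for y in range(n)}
        points.add(("inf", "vertical"))
        lines.append(frozenset(points))
    # The line at infinity collects the n + 1 extra points.
    at_infinity = [("inf", m) for m in range(n)] + [("inf", "vertical")]
    lines.append(frozenset(at_infinity))
    return lines


for n in (3, 7):
    lines = projective_plane(n)
    points = set().union(*lines)
    one_common_point = all(len(a & b) == 1 for a, b in combinations(lines, 2))
    print(n, len(points), len(lines), one_common_point)
```

For n = 7 this gives 57 points and 57 lines with 8 points on each line, matching a full deck of 57 cards with 8 images per card if lines are read as cards and points as images.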

[1] is a great overview of the finite geometry behind 
Spot it!®. For those whose interest in finite 
geometry is beyond the scope of this discussion, I 
recommend [2] for those who have not yet mas- 
tered abstract algebra, and [3] and [4] for those 
who have. This technique generalizes for other 
orders, although its utility as a game diminishes 
due to the quadratic growth of the set size. This 
discussion is arranged so that those wishing fewer 
clues can read fewer sections, thus leaving more 
of the fun for themselves. 
Figure 2 The order 3 projective (left) and affine (right) planes 


Finding the Missing Cards 

I will presume for the moment that the reader 
wishes to find the two "missing" cards in a Spot 
it!® deck. First, list the images in the deck. A quick 
way to do this is to locate an image which occurs 
eight times (that is, n + 1), and pull out all those 
cards. For example, if the deck has eight spiders, 
pull out the spider "set". There can be no other im- 
ages common to any pair out of those eight cards. 
Thus, there must be 8 × 7 + 1 = 57 images, or all of them, on 
those eight cards. Next, tabulate the frequencies 
for each image. One image is present only six 
times, while 14 images are present seven times. 
All other images should be present eight times, 
as they are not missing from any card. The image 
which is missing twice must be assigned to both 
missing cards. Without loss of generality, assign 
one additional missing image to one of the miss- 
ing cards. Call this image the "reference image". 
For each of the remaining 13 images, search the entire 
deck to see if it occurs with the reference image or 
not. If it does, it cannot do so again, so it must be 
assigned to the other missing card. If, however, it 
does not occur with the reference image, it should. 
So it should go on the missing card which has the 
reference image. 
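
The procedure just described can be sketched in Python. Since a real deck's artwork is not available here, the example builds a stand-in deck from a coordinate construction of the order 7 projective plane (an assumption for illustration), removes two cards, and then recovers them using only frequency counts and the reference-image test. All function names are hypothetical.

```python
from collections import Counter


def full_deck(n=7):
    """All n*n + n + 1 cards, built as lines of a plane of prime order n."""
    cards = [
        frozenset({(x, (m * x + b) % n) for x in range(n)} | {("inf", m)})
        for m in range(n) for b in range(n)
    ]
    cards += [
        frozenset({(c, y) for y in range(n)} | {("inf", n)}) for c in range(n)
    ]
    cards.append(frozenset(("inf", m) for m in range(n + 1)))
    return cards


def find_missing_cards(deck):
    """Reconstruct the two missing cards of a deck that is two cards short."""
    counts = Counter(image for card in deck for image in card)
    full = max(counts.values())                    # n + 1 appearances
    twice_missing = [im for im, k in counts.items() if k == full - 2]
    once_missing = [im for im, k in counts.items() if k == full - 1]
    reference, rest = once_missing[0], once_missing[1:]
    card_a = set(twice_missing) | {reference}      # holds the reference image
    card_b = set(twice_missing)
    for image in rest:
        # Already paired with the reference somewhere? Then it cannot be again.
        together = any(image in card and reference in card for card in deck)
        (card_b if together else card_a).add(image)
    return frozenset(card_a), frozenset(card_b)


deck = full_deck(7)
removed = {deck[3], deck[40]}
remaining = [card for card in deck if card not in removed]
print(len(remaining), set(find_missing_cards(remaining)) == removed)
```

The sketch assumes exactly two cards are missing; with any other number the frequency pattern is different and the function would need adapting.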

The Rules 

Begin by removing all cards having a specific 
common image which we will call the "infinity 
image". The remaining cards form an affine plane 
of order 7 (or n). (In my Spot it!® deck, the 
two missing cards both contain a snowman. So, 
if I simply remove all the snowman cards at this 
step, I do not actually need to find the two missing 
cards in order to proceed.) The infinity cards are 
analogous to the orange points in Figure 3. 

The ultimate goal is to lay out the remaining 
49 (or n^2) cards in a 7 × 7 (or n × n) grid so 
that this one rule is satisfied: Given any two 
cards in the grid, with positions given by (x, y) 
and (x + h, y + k) (with x and y numbered 
between zero and six inclusive), the common 
image of the two cards must also be present at 
position (x + 2h (mod 7), y + 2k (mod 7)) (or 
(x + 2h (mod n), y + 2k (mod n))). For example, 
in Figure 3, the solved n = 5 case is shown. 
Consider the (row, column) positions (1, 1) and 
(2, 4). The red coloured circle (with a plus symbol) 
is common to these cards, so at (3, 2), we expect 
to find this symbol again, and indeed it is there. 
Since seven (or n) is prime, and all non-zero elements are 
generators in Z_7 (or Z_n), there will be seven (or n) 
such images in a set, within the 7 × 7 (n × n) grid. 
This also implies that each row (and each column) 
will have a common image. (The symbols in Fig- 
ure 3 give the same information as the colours. 
For example, all the red symbols have a "plus" sign 
on them.) 
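
The rule is mechanical enough to test by computer. In the Python sketch below, a solved grid is generated from the lines r = mc + b (mod n) together with the columns; this labelling is an assumption for illustration, with each label playing the role of an image. The checker then applies the rule to every pair of cards.

```python
def affine_grid(n):
    """grid[r][c] is the set of images (line labels) on the card at (r, c)."""
    grid = [[set() for _ in range(n)] for _ in range(n)]
    for r in range(n):
        for c in range(n):
            grid[r][c].add(("column", c))
            for m in range(n):
                # The card lies on the line r = m*c + b (mod n) for this b.
                grid[r][c].add((m, (r - m * c) % n))
    return grid


def satisfies_rule(grid):
    """Check the one rule for every ordered pair of distinct cards."""
    n = len(grid)
    cells = [(r, c) for r in range(n) for c in range(n)]
    for x, y in cells:
        for u, v in cells:
            if (x, y) == (u, v):
                continue
            common = grid[x][y] & grid[u][v]
            if len(common) != 1:
                return False
            h, k = u - x, v - y
            target = grid[(x + 2 * h) % n][(y + 2 * k) % n]
            if not common <= target:
                return False
    return True


solved = affine_grid(5)
print(satisfies_rule(solved))     # True

scrambled = affine_grid(5)
scrambled[0], scrambled[1] = scrambled[1], scrambled[0]  # swap two rows
print(satisfies_rule(scrambled))  # False: rows and columns survive, diagonals do not
```

The scrambled grid still has a common image along every row and every column, which is why the row and column condition alone does not finish the puzzle for n = 5 or n = 7.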

The families of parallel images in the n = 5 affine 
plane are analogous to the parallel lines in Figure 
2. For one parallel family, all images lie on lines 
having slope 1. In Figure 3, this is the coloured 
triangle family. The purple coloured triangles 
lie on the line r = c + 1 (mod 5), while the red 
ones lie on the line r = c + 4 (mod 5). The blue 
coloured squares are from another parallel family, 
and they lie on the line r = -2c + 2 (mod 5). 
(Another equivalent equation for this line is 
c = 1 + 2r (mod 5).) 

The method for creating such a deck of cards 
should be obvious now. First, generate the affine 
grid using parallel lines. Then, each parallel image 
family is placed on a new card, together with the 
infinity image. However, the goal here is not to 
create such a deck but rather to properly arrange 
an already existing deck. Those readers desiring 
the maximal fun should now attempt to solve this 
inverse problem without reading further. One 
more warning will be given after the easy clues 
are presented. 

Initial Setup of the Grid 

First, pull out any set of eight cards which share an 
image, to be used as the infinity cards. (Or, if the 
missing cards have not been created, the six cards 
sharing the twice-missing image should be pulled 
aside.) Next, select one of the infinity cards to represent 
your row family and one to represent your 
column family, and keep them in view. Arrange 
your grid so that each column contains a common 
image and each row contains a common image. 
(Note that there are 57 × 8 × 7 × 7! × 7! ways to 
make these choices. There are 57 images to pick 
as the infinity image, then eight cards from that 
infinity set which can be used to define the rows. 
Once the rows are chosen, seven cards remain to 
define the columns. There are 7! ways to order 
rows and 7! ways to order columns. Also, note that 
the n = 3 case is fully solved at this stage.) Next, by 
swapping rows and/or columns of cards, place a 
common image on the line which runs from the 
lower left corner to the upper right corner (r = c), 
which we'll call the first diagonal from now on. 
All "moves" henceforth will consist of swapping 
two rows or swapping two columns. We know 
from abstract algebra that all permutations can 
be formed by swaps, so swapping rows/columns 
is sufficient to solve this puzzle. 

Swap rows 2 and 3, then swap columns 2 and 3. Squares are rearranged. 

Figure 4 Squares on the diagonal deforming and returning again 

Figure 5 Choosing the correct image for the second diagonal 

The next objective is to get a common image on the 
second diagonal, which we define to run from the 
upper left corner to the lower right corner (r = -c - 1). 
Swapping rows will preserve the common image 
along the first diagonal as long as the columns 
with corresponding indices are also swapped. For 
example, if rows 2 and 3 are swapped, then col- 
umns 2 and 3, the net effect on the line r = c is to 
swap the card at (2, 2) with the card at (3, 3). This 
technique allows us to maintain the r = c diagonal 
while still giving enough freedom to finish the 
puzzle. Without loss of generality, freeze the mid- 
dle card. Thus, you will not move the middle row 
or middle column. (The fact that we are working 
on a torus allows us to freeze any one card, but 
the choice of the middle card makes the solution 
easier to execute, due to symmetry.) This gives 
6! = 720 remaining grid arrangements, six of 
which are valid solutions. 
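
The effect of a paired row/column swap on the two diagonals can be checked on a grid of position labels. This Python sketch is illustrative only; the function names are made up, and each card is represented simply by its starting position.

```python
def paired_swap(grid, i, j):
    """Swap rows i and j, then columns i and j, returning a new grid."""
    n = len(grid)
    perm = list(range(n))
    perm[i], perm[j] = perm[j], perm[i]
    return [[grid[perm[r]][perm[c]] for c in range(n)] for r in range(n)]


def first_diagonal(grid):
    """Set of cards on the line r = c."""
    return {grid[k][k] for k in range(len(grid))}


def second_diagonal(grid):
    """Set of cards on the line r = -c - 1 (mod n)."""
    n = len(grid)
    return {grid[k][n - 1 - k] for k in range(n)}


start = [[(r, c) for c in range(7)] for r in range(7)]

after = paired_swap(start, 2, 3)
print(first_diagonal(after) == first_diagonal(start))    # True
print(after[2][2], after[3][3])                          # (3, 3) (2, 2)
print(second_diagonal(after) == second_diagonal(start))  # False

balanced = paired_swap(start, 2, 4)
print(second_diagonal(balanced) == second_diagonal(start))  # True
```

Swapping 2 with 3 keeps the first diagonal as a set but disturbs the second, whereas swapping 2 with 4 (which add up to 6) preserves both.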

Finding the Second Diagonal 
and Finishing the Puzzle 

At this point, if you wish to have any fun with the 
puzzle, you should stop reading. This is your last 
warning! 


The somewhat surprising fact is that once you 
have chosen the image for the line r = c and have 
frozen the middle card, the image for the line 
r = -c - 1 is already determined. It has to be one 
of the images on your middle card, but it cannot 
be the image of that row or of that column or of 
the line r = c. So there appear to be five options 
remaining. But that is not so. Only one will work, 
and any attempt to set the incorrect image will 
end in frustration! So how do we figure out which 
image will work? 

Since all moves are reversible, we may simply 
track the scrambling process to see where images 
on the second diagonal r = -c - 1 may go when we 
allow paired swaps of rows/columns. We imagine 
three (that is, (n - 1)/2) concentric squares around the 
frozen middle card to help us keep track of where 
the second diagonal set can go. Figure 4 demon- 
strates one of the 15 possible deformations of the 
second diagonal and also demonstrates how, as 
promised, the cards along r = c are, as a set, invari- 
ant. The white circles represent the cards on the 
line r = c, while the orange circles represent those 
which will ultimately be on the line r = -c - 1. 
The (group) actions of swapping matched rows/ 
columns maintain these three sets of four cards 
as corners of squares which are symmetric about 
the line r = c, though they are not concentric except 
when the two diagonals are both correctly set. 
To determine which of the five candidate images 
should be used as the image for the line r = -c - 1, 
we simply search for the image which is already 
symmetric with respect to the line r = c (see Figure 
5). We can think of the six non-fixed cards 
along r = c as being in three pairs, thus inducing 
the squares. Combinatorially, this can occur in 15 
ways, as shown in Figure 6. Swap rows and corresponding 
columns until the images on the second 
diagonal are set. Note that, for the order 5 case, 
even the incorrect candidate images for the second 
diagonal are arranged symmetrically with respect 
to r = c. Look for four misplaced candidate 
images instead of two. 

Figure 6 The 15 ways the cards of the second diagonal may initially appear 

Once the two diagonals are set, remaining moves 
must not only have paired row/column moves, 
but must also maintain symmetry between right 
and left (as well as up and down). For example, 
if rows/columns two and four are swapped, this 
is already balanced and will not disrupt either 
diagonal line. However, if columns zero and one 
are swapped, columns five and six must also be 
swapped (as well as rows zero and one, and also 
rows five and six). By tightening these orbits, we 
close in on one of the six solutions. 

Final Moves 

Once the two diagonals are established, there are 
still 48 possible card arrangements. Each "square" 
may be in each of the three locations (3! = 6), 
and each has two legal orientations as it is legal 
to rotate it 180°. Since 2^3 = 8 and 6 × 8 = 48, there 
are 48 possible arrangements of the cards, six of 
which are solutions. For example, you have the 
freedom to choose any one square's location and 
orientation, but the rest is then predetermined. 
For simplicity, we presume that the innermost 
square is set properly, i.e. the nine cards in the 
middle of the grid are now fixed. 
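
The count of 48 can be checked by brute-force enumeration. The following is a minimal sketch that models only the counting argument (three squares, three locations, two orientations each); the names `placements` and `orientations` are illustrative and it does not model the cards themselves.

```python
from itertools import permutations, product

# Three squares are assigned to three locations (3! ways), and each square
# is either left as it is or rotated by 180 degrees (2 choices per square).
placements = list(permutations(range(3)))
orientations = list(product((0, 180), repeat=3))
arrangements = [(p, o) for p in placements for o in orientations]

print(len(placements), len(orientations), len(arrangements))
```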

Now, using what is known about the families of 
parallel images, move the cards into their final 
positions. For example, since we now know which 
family of images has a slope of -1, we can deduce 
which of those images should appear on the cards 
in positions (0, 4) and (1,3) just by looking at the 
card in position (2, 2). If an image is not in the 
desired location, look for it on the opposite side of 
the affine grid, relative to a 180° rotation about the 
centre. You might also need to swap the middle 
and outermost squares. In a few moves you will 
see before you a perfectly arranged affine plane!

A Vortex Street


These clouds off the coast of Chile
showed a beautiful Kármán vortex
street, a pattern in fluid dynamics
which can be caused by fluids flowing
past a blunt obstacle - in this
case, the Juan Fernández Islands.

The Prisoner's Dilemma game is the mathematicians'
gift to moral philosophy. The
odd name of the game comes from the
story of two prisoners accused of a joint crime.
They are kept in separate cells. The state attorney
visits them in turn, and offers each a deal. If one
of them confesses, he will become a witness for
the prosecution, and be immediately released; the
other one will be sentenced to ten years. If both
confess, however, there is no need for a witness
for the prosecution, and both will face seven years
in jail. Of course, both could refuse to rat on the
other. In that case, the attorney promises to keep
them in custody for the maximum period legally
allowed, one year. No matter what the other is going
to decide, the best response for either of them
is to confess. Hence both confess. Each will have
seven long years to ponder the diabolic nature of
the attorney's offer!

A prisoner with a gift for abstraction may come
up with a matrix summarising the conundrum with
a chain of inequalities. In the tradition-bound
jargon of game theory (the mathematics of conflicts
of interest), two players 1 and 2 each have
to decide between two strategies, namely C (to
cooperate) and D (to defect). This yields four possible
outcomes for player 1, given by R, S, T or P
(see Figure 1). Player 1 prefers T to R, R to P, and
P to S; if the outcomes correspond to numbers
(so-called utilities or payoffs), T > R > P > S. If
player 2 chooses C, player 1 is better off playing
D, since T > R; if player 2 chooses D, player 1 is
better off playing D, since P > S. Hence, no matter
what player 2 chooses, player 1 should opt for D.
But since the same goes for 2, who is in exactly
the same situation, both end up with P. And the
reason that they have forfeited the better outcome
R is, ironically, self-interest!
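
The dominance argument can be checked mechanically. The sketch below assumes the payoffs are minus the years in jail from the opening story (so T = 0, R = -1, P = -7, S = -10) and reads "staying silent" as C and "confessing" as D; both choices are illustrative assumptions, and any numbers with T > R > P > S behave the same way.

```python
# Payoffs for player 1, measured as minus the years spent in jail:
# T (I confess, the other stays silent), R (both stay silent),
# P (both confess), S (I stay silent, the other confesses).
T, R, P, S = 0, -1, -7, -10

# payoff[(my_move, other_move)] for player 1; C = stay silent, D = confess
payoff = {("C", "C"): R, ("C", "D"): S, ("D", "C"): T, ("D", "D"): P}


def best_response(other_move):
    """Return the move that maximises player 1's payoff against other_move."""
    return max("CD", key=lambda my_move: payoff[(my_move, other_move)])


# D is the best response to either move, yet (D, D) is worse than (C, C).
print(best_response("C"), best_response("D"))
print(payoff[("D", "D")], "<", payoff[("C", "C")])
```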

The Iterated Prisoner's 
Dilemma 

The situation changes considerably if the two
players interact not only once (the so-called one-shot
Prisoner's Dilemma) but repeatedly (an iterated
Prisoner's Dilemma). In daily life, repeated
interactions are the rule rather than the exception.
One can rarely be sure of seeing the last of another
person. If the likelihood of another encounter
is high, players can hope to influence their
co-player's behaviour. It is possible to react. In
the gift-giving game, for instance, players will anticipate
that if they fail to give in one round, they
will fail to receive in the following round. And our
'prisoner' may thus be swayed by the prospect that
the co-player will have an opportunity for 'getting
even'.

In the iterated Prisoner's Dilemma game, the
strategies must specify what to do in each round.
For instance: defect in each round (this is the
strategy AllD). Or else: play C in round n iff n is
prime. Or else: play C with probability 0.5. More
interesting opportunities are provided by strategies
conditioned on the co-player's actions. For
instance: play C in the first round, and from then
on, play C iff the co-player used C in the previous
round (this is the strategy Tit For Tat, TFT). Or
else: play TFT in the first hundred rounds, then
AllD for the remainder of the game, etc.
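
These strategies are easy to write down as functions of the round number and the two histories. The following is a minimal sketch; the payoff values (3, 0, 5, 1), the function signature and the name `play` are illustrative assumptions, and rounds are numbered from 0.

```python
R, S, T, P = 3, 0, 5, 1  # illustrative payoffs with T > R > P > S


def all_d(round_number, own_history, other_history):
    """AllD: defect in every round."""
    return "D"


def tit_for_tat(round_number, own_history, other_history):
    """TFT: cooperate first, then copy the co-player's previous move."""
    return "C" if round_number == 0 else other_history[-1]


def primes_only(round_number, own_history, other_history):
    """Play C in round n if and only if n is prime."""
    n = round_number
    is_prime = n >= 2 and all(n % d for d in range(2, int(n ** 0.5) + 1))
    return "C" if is_prime else "D"


def play(strategy_1, strategy_2, rounds):
    """Return the total payoffs of both players after the given number of rounds."""
    payoff = {("C", "C"): (R, R), ("C", "D"): (S, T),
              ("D", "C"): (T, S), ("D", "D"): (P, P)}
    history_1, history_2, total_1, total_2 = [], [], 0, 0
    for n in range(rounds):
        move_1 = strategy_1(n, history_1, history_2)
        move_2 = strategy_2(n, history_2, history_1)
        gain_1, gain_2 = payoff[(move_1, move_2)]
        total_1, total_2 = total_1 + gain_1, total_2 + gain_2
        history_1.append(move_1)
        history_2.append(move_2)
    return total_1, total_2


print(play(tit_for_tat, all_d, 10))
print(play(tit_for_tat, tit_for_tat, 10))
print(play(tit_for_tat, primes_only, 10))
```

Against AllD, TFT is exploited only in the first round; against itself it cooperates throughout.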

Memory-One Strategies 

Strategies for the iterated Prisoner's Dilemma can
be extremely complex; the only restriction is that
they cannot depend on the future. A particularly
simple class of strategies - the memory-one strategies -
has the property that the player's next
move depends only on the current state of the
game. The state in round n is determined by the
payoff $k \in \{R, S, T, P\}$ obtained by player 1 in that
round. Thus a memory-one strategy is given by a
quintuple $(p_R, p_S, p_T, p_P, p_0)$, where $p_0$ is the probability
to play C in the initial round ($n = 0$) and the
vector $p = (p_R, p_S, p_T, p_P)$ denotes the conditional
probabilities to play C in the next round, when the
current state is $k \in \{R, S, T, P\}$. If the probability
$\delta$ for another round is fixed (with $0 < \delta < 1$), then
the number of rounds is a random variable with
expectation $1/(1 - \delta)$. If $\pi_1(n)$ is player 1's payoff in
round n, then the total payoff is $\sum_{n=0}^{\infty} \delta^n \pi_1(n)$ and the
payoff per round is defined as the Abelian mean:

$$\pi_1 := (1 - \delta) \sum_{n=0}^{\infty} \delta^n \pi_1(n)$$

The same holds for player 2's payoff $\pi_2$. In the limiting
case $\delta = 1$, the payoff per round is given by
the Cesàro mean:

$$\pi_1 := \lim_{N \to \infty} (N + 1)^{-1} \sum_{n=0}^{N} \pi_1(n)$$

This limit need not always exist: for instance, if the
co-player plays C in the first ten rounds, D in the
next hundred rounds, C in the following thousand
rounds, D in the next ten-thousand rounds
etc, then the mean payoff is likely to oscillate endlessly.
But if the Cesàro mean exists, it is the limit
of the Abelian mean, for $\delta \to 1$. The pair $(\pi_1, \pi_2)$
is in the quadrangle Q spanned by the four points
(R, R), (S, T), (T, S) and (P, P) (Figure 2). If player
1 adopts a specific strategy, then the pair $(\pi_1, \pi_2)$
depends on player 2's choice, and will typically
range over a two-dimensional subset of Q. But
when player 1 adopts a so-called zero determinant
(ZD) strategy, then the pairs are restricted to a
line. The construction of such strategies is not too
tricky, and further details are provided in the appendix.
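
The relation between the two notions of payoff per round can be illustrated numerically. The sketch below assumes a hypothetical, bounded payoff sequence that alternates between 5 and 0 (as when two players take turns exploiting each other with T = 5 and S = 0); the function names are illustrative, and the infinite sum is truncated once the weights become negligible.

```python
def abelian_mean(payoff_in_round, delta, tolerance=1e-12):
    """(1 - delta) * sum over n of delta**n * payoff(n), truncated when delta**n <= tolerance."""
    total, weight, n = 0.0, 1.0, 0
    while weight > tolerance:
        total += weight * payoff_in_round(n)
        weight *= delta
        n += 1
    return (1 - delta) * total


def cesaro_mean(payoff_in_round, rounds):
    """Plain average of the payoffs over the first `rounds` rounds."""
    return sum(payoff_in_round(n) for n in range(rounds)) / rounds


def alternating(n):
    """Hypothetical payoff sequence 5, 0, 5, 0, ..."""
    return 5 if n % 2 == 0 else 0


# For this sequence the Abelian mean equals 5 / (1 + delta) exactly,
# which tends to the Cesaro mean 2.5 as delta approaches 1.
for delta in (0.5, 0.9, 0.99, 0.999):
    print(delta, abelian_mean(alternating, delta))
print(cesaro_mean(alternating, 10_000))
```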

If player 1 adopts a ZD strategy, then

$$\pi_2 - \kappa = \chi (\pi_1 - \kappa)$$

irrespective of the strategy adopted by player 2.
This means that player 1 can unilaterally enforce
a linear relation between the two payoff values -
the two prisoners are chained together. Thus the
payoff pair $(\pi_1, \pi_2)$ is restricted to the line with
slope $\chi$ intersecting the diagonal in $(\kappa, \kappa)$. The
value of $\kappa$ corresponds to the payoffs of the two
players when both use the same strategy. If the
parameters of the Prisoner's Dilemma game satisfy
P < [...] then one can easily show that $\kappa$ is
between P and R, whereas $\chi$ is between -1 and
1. Let us consider a few instances. By choosing
$\chi = 0$, player 1 can assign payoff $\kappa$ to the co-player.
Player 2 need not be restricted to memory-one
strategies, but can switch capriciously between
Cs and Ds. Player 2's actions can affect only the
payoff for player 1, but not his own payoff $\pi_2$. The
converse aim (to assign a specific value to his
own payoff) cannot be realized. Indeed, since the
gradient of the payoffs line is at most one, it cannot
be vertical.

Some Classes of Strategies 

By choosing $\kappa = P$ and $\chi = 1/2$, player 1 can act
as extortioner. Whenever player 2 attempts to
gain a surplus over and above the maximin value
P, player 1's surplus will be twice as large. When
two extortioners meet, their surplus will be zero,
of course.
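
The extortion relation can be checked numerically. The sketch below does not reproduce the article's appendix; it assumes the standard construction from the zero-determinant literature, in which the vector $(p_R - 1, p_S - 1, p_T, p_P)$ is a positive multiple `phi` of $\chi(\pi_1\text{-payoffs} - \kappa) - (\pi_2\text{-payoffs} - \kappa)$. The payoff values (3, 0, 5, 1), the scale `phi = 1/9`, the random memory-one co-players and all function names are illustrative assumptions. Payoffs are computed in the limiting case of no discounting by iterating the distribution over the four states.

```python
import random

R, S, T, P = 3.0, 0.0, 5.0, 1.0      # illustrative values with T > R > P > S
PAY_1 = (R, S, T, P)                 # player 1's payoff in the states R, S, T, P
PAY_2 = (R, T, S, P)                 # player 2's payoff in the same states


def zd_strategy(chi, kappa, phi):
    """Candidate (p_R, p_S, p_T, p_P) enforcing pi_2 - kappa = chi * (pi_1 - kappa)."""
    shift = [phi * (chi * (a - kappa) - (b - kappa)) for a, b in zip(PAY_1, PAY_2)]
    p = (1 + shift[0], 1 + shift[1], shift[2], shift[3])
    assert all(0.0 <= x <= 1.0 for x in p), "phi is unsuitable for these parameters"
    return p


def long_run_payoffs(p, q, steps=20_000):
    """Payoffs per round without discounting, for memory-one strategies p and q."""
    # Player 2 indexes states by his own payoff, so S and T are interchanged.
    q_by_state = (q[0], q[2], q[1], q[3])
    v = [0.25] * 4                   # distribution over the states R, S, T, P
    for _ in range(steps):
        nxt = [0.0] * 4
        for s in range(4):
            a, b = p[s], q_by_state[s]
            nxt[0] += v[s] * a * b               # both play C
            nxt[1] += v[s] * a * (1 - b)         # 1 plays C, 2 plays D
            nxt[2] += v[s] * (1 - a) * b         # 1 plays D, 2 plays C
            nxt[3] += v[s] * (1 - a) * (1 - b)   # both play D
        v = nxt
    pi_1 = sum(x * y for x, y in zip(v, PAY_1))
    pi_2 = sum(x * y for x, y in zip(v, PAY_2))
    return pi_1, pi_2


extortioner = zd_strategy(chi=0.5, kappa=P, phi=1 / 9)
random.seed(0)
for _ in range(5):
    q = [random.uniform(0.1, 0.9) for _ in range(4)]
    pi_1, pi_2 = long_run_payoffs(extortioner, q)
    # The first surplus should be twice the second, whatever q is.
    print(round(pi_1 - P, 6), round(pi_2 - P, 6))
```

With these values the extortioner's strategy is approximately (0.889, 0.5, 0.333, 0); the co-player strategies are kept away from 0 and 1 only so that the iteration settles on a unique long-run distribution.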

Possibly the most surprising class of ZD strategies
is obtained for $\kappa = R$ and $0 < \chi < 1$. Players using
such a strategy provide payoff R to their co-player,
as long as they themselves receive R. If their
payoff is less than R, however, then the co-player's
payoff is also reduced. This seems to be just
the type of strategy that conditional cooperators
have in mind. It is a strategy of 'live and let live'
but with a barb: 'if you push me down I will take
you down'. No self-interested co-player will have
reason to cheat on such a player, or both payoffs
will be reduced (that of the co-player is actually
reduced by less, since $0 < \chi < 1$). Such strategies
fare remarkably well, despite their readiness to
shoulder the larger part of an eventual loss (with
respect to R). They seem to embody the spirit of
partnership.

If player 1 sees the co-player not as a partner, but
as a rival, then 1's main aim will be to ensure that
$\pi_1 \ge \pi_2$. The only ZD strategies which achieve
this are strategies with $\kappa = P$, such as the extortioner
strategies. (Aiming for the strict inequality
$\pi_1 > \pi_2$ is infeasible, of course, since the co-player
can play AllD.)

The Evolution of ZD Strategies 

Let us consider a very large population of players,
all using the same pair of values $(\chi, \kappa)$. If a
small dissident minority using a slightly changed
pair shows up, it may do better or worse than the
resident population. If it does worse, it will vanish.
If it does better, it will spread, and eventually
take over. It can easily be shown that it is most
advantageous for the dissident to keep the slope
$\chi$ unchanged, to increase $\kappa$ if the slope is positive,
and to decrease $\kappa$ if it is negative. This means that
whenever the payoff values are positively related
(i.e., when $\chi > 0$), then adaptation, in the sense
above, drives $\kappa$ from P to R. Evolution leads from
extortion to generosity. This is a surprising result:
whereas extortion strategies are those that guarantee
that the player's own payoff is not less than
the co-player's, generous strategies ensure that his
payoff is never above the co-player's. Does this
suggest, then, that the meek will inherit the Earth?

3 Recursively Coloured Triangles


Consider the following construction: 

Begin with a white triangle of area 1 

At each step, join the midpoints of the edges of 
every white triangle that has an edge contained 
in an edge of the original triangle, and shade 
the resulting triangles. 

After 4 steps, the diagram looks like this: 

How much area is eventually shaded? 


1 Infinite Sequences


Is it possible to arrange an infinite number of ones
and zeroes such that no pattern (of any given length)
is repeated three times in a row (for example 111, or
111011101110)?















7 Geometry 


There are 6 spherical robot sentinels floating in a regular
hexagonal arrangement in a large empty room. They do not
move, but can observe everything around them, and none of
them are obstructing their views of each other.


What proportion of the robots' total surface areas are within
sight of precisely n of the other robots (for n = 0 to 5)?


4 Primes 


A deletable prime is a number such that it is a 
prime number, and digits can be deleted one 
by one from its base 10 expansion such that 
a prime is obtained at each step. Which of 
a) 395247625, b) 410139761, c) 410256793 and 
d) 357019711 is a deletable prime? 


5 Subgroups

Consider the group of possible permutations of a Rubik's
Cube®. What is the order of the subgroup generated by
rotating the middle 'slices' by half


6 Random Numbers

Four of these are random numbers, and one is an encrypted message. Which
is the encrypted message?

a) 00110001110110001010101111100010101000100111101101001110100
   0010101011100010111000101110001010000111011010000000011000100

b) 11101101010110000110011101100011101011011101101001101000010
   1111110011100100111111011010101100001100111001100110110010111

c) 10101011101110000000001001011000111100011000000000100010011
   1001011101101101111000111111110001000010100100100001111001001

d) 00000001110111101111000000000101011101110111101110100011001
   0100000010110100000100111101101000000110110110001100010000111

e) 11111011010001101100000000101011010110001110000001010101100
   0001110010001010111000110110001111101100010111111001000100101









Now, each of the 6 robots flies directly towards the robot nearest to it 
clockwise, at a speed of one unit per second, where a unit is the side 
length of the hexagon. 


Assuming the size of the robots is negligible, how long (in seconds) 
will it take for the robots to collide? 


Suppose we have six dots arranged in a line. Find the number of paths through some or 
all (but not none) of these points, subject to the limitations: 

a) No point can be visited twice, 

b) If a point obstructs another point, the obstructing point should be visited first.

(e.g. If the points are numbered 1-6 from left to right, then the paths "1 - 2 - 3", "2 - 3 - 1",
"3 - 4 - 5 - 2" are permitted, but "1 - 3", "3 - 4 - 1" are not.)


10 Binary Numbers


Find (in decimal) the smallest natural number n such that n is odd, the
number of 1s in its binary expansion, n_1, is even, the number of 1s (n_2)
in n_1's binary expansion is odd, the number of 1s (n_3) in n_2's binary
expansion is even, and the number of 1s in n_3's binary expansion is odd.


11 Probability


You have volunteered to undergo the following experiment: On Sunday you
will be put to sleep. Once or twice, during the experiment, you will be wakened,
interviewed, and put back to sleep with an amnesia-inducing drug
that makes you forget that awakening. A fair coin will be tossed to determine
which experimental procedure to undertake: if the coin comes up heads, you
will be wakened and interviewed on Monday only. If the coin comes up tails,
you will be wakened and interviewed on Monday and Tuesday. In either case,
you will be wakened on Wednesday without interview and the experiment
ends.


You have been woken up and are being interviewed. What is the probability
that the coin landed heads?










13 Some Cake


What's the maximum number of pieces you can
make with 5 straight cuts (without rearranging
the pieces) on a:

a) 2D circular cake?

b) 2D crescent cake?

c) 3D spherical cake? 


There are 9 plates in a line. The four plates on the left contain chocolate cakes. The four plates
on the right contain banana cakes. There is an empty plate in the middle.


A move consists of either moving any cake onto an adjacent empty plate, or moving a cake
past one other cake onto an empty plate.


What's the minimum possible number of moves to reverse the initial arrangement of the cakes?


Not all of the questions from the 2013 Problems Drive are contained here. Some of the problems
here were not present in the problems drive, or are paraphrased. The intersection between
these two sets, however, is indeed non-empty! Solutions can be found on page 1110110.

The Original Conjecture

In an article published in Eureka 36 (October 
1973), the unique factorial discovery that there is 
only one digit 5 in 82! was reported (see [1]). The 
following was also conjectured: 

Conjecture (Castell, 1973)

Let y be the percentage of zeroes in the $n!$ number
string $S_n$, x the percentage of any other digit. For
large n, x and y satisfy: $y \approx 2x$; $(r - 1)x + y \approx 100$,

where r is the base of the number system.

For example, for decimal factorials (r = 10) 
this conjecture gave: x = 9.09% (1/11) and 
y = 18.18% (2/11). As was reported in the 1973 
Eureka paper, this result did reasonably accurately 
apply for factorials up to 800!, where n = 800 was 
the computational limit imposed by the technol- 
ogy available at that time. 

Castell's original (1973) conjecture was derived
essentially using the following intuitive reasoning:
the string $S_n$ is in total not a random number,
and there is a definite extra 'doubling' effect
of the occurrences of 'trailing zeroes' whenever a
multiplication by a 5 or a 10 occurs (for r = 10);
however, intuitively, all digits other than a zero in
$S_n$ are equally probable. The computational data
available at the time seemed to indicate that zeroes
did indeed occur twice as often as any other
of the digits 1-9.

New Developments 

The following new standout results have recently 
been discovered by the present authors: 

1. There is no digit B (i.e. '11' in decimal) in 
75! (base 16), a 160 digit number. 

2. The original conjecture does not fit well with 
data for base 16 or base 2, perhaps due to the 
number of factors of each base. In addition, for 
larger factorials than were computationally avail- 
able in 1973, the original conjecture is inaccurate 
for base 10. 


3. A new conjecture: 

Conjecture (Castell and Khaleghpour, 2010)

Let y be the percentage of zeroes in the $n!$ number
string $S_n$, x the percentage of any other digit.
For large n, x and y satisfy: $y \approx (0.075r + 1.05)x$;
$(r - 1)x + y \approx 100$, where r is the base of the number
system.

This gives the following table, a very good fit to
the $n!$ data we have obtained:


Base r      x        y
2         45.45    54.55
10         9.26    16.66
16         5.80    13.00
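
The values of x and y follow from solving the two relations of each conjecture, and they can be compared with digit counts computed directly. The following is a minimal sketch; the function names are illustrative, the digit alphabet for bases above 10 is an assumption, and no particular output for large factorials is claimed here.

```python
from collections import Counter
from math import factorial

DIGITS = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ"


def to_base(value, base):
    """Digit string of a positive integer in the given base (2 to 36)."""
    digits = []
    while value:
        value, remainder = divmod(value, base)
        digits.append(DIGITS[remainder])
    return "".join(reversed(digits))


def digit_percentages(n, base=10):
    """Percentage of each digit in the base-`base` string of n!."""
    string = to_base(factorial(n), base)
    counts = Counter(string)
    return {d: 100 * counts[d] / len(string) for d in DIGITS[:base]}


def conjectured_percentages(base, ratio):
    """Solve y = ratio * x together with (base - 1) * x + y = 100 for (x, y)."""
    x = 100 / (base - 1 + ratio)
    return x, ratio * x


print(conjectured_percentages(10, 2))                   # 1973 conjecture, base 10
print(conjectured_percentages(10, 0.075 * 10 + 1.05))   # revised conjecture, base 10
observed = digit_percentages(800)
print(observed["0"], observed["5"])                     # observed zeroes and fives in 800!
```

The count of a single digit in a given factorial can be read off in the same way, for example with `Counter(to_base(factorial(82), 10))["5"]`.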


A New Challenge

We challenge the interested reader to prove or disprove
the above revised conjecture. In particular,
we are interested in whether the non-zero digits,
for large n, do generally occur in equal proportion
within each factorial number string (see [2]
for contrast, which considers the pattern of leading
digits of factorials). It may additionally be of
interest for the reader to consider our discovered
'anomalous' factorials (82!, base 10 - only one digit
5; and 75!, base 16 - no digit B) from a formal
'statistical likelihood' perspective; and also to investigate
whether there are any further factorials
that stand out for any reason.

References and Notes 

[1] Stephen P. Castell; 1973; On the Distribution
of Decimal Digits in n!; Eureka, 36, 45-47;
http://www.archim.org.uk/archives/eureka/#36.

(This discovery that there is only one digit 5 in 82!
was subsequently designated by Computer Bulletin
as being 'the most useless fact discovered by a
computer'.)

[2] John D. Cook; Leading digits of factorials;
http://www.johndcook.com/blog/2011/10/19/leading-digits-of-factorials/.

The authors are happy to be contacted with any 
thoughts on this topic at, respectively: 
cstll01@attglobal.net and f.khp01@gmail.com.

Figure 3 "Ecce Homo" (left) and "Ecce Mono" (right)

Figure 4 Mathematical image restoration of "Ecce Homo". (d) is courtesy
of Rob Hocking, using the algorithm described in [1] and [2].
(a) Mask for restoration
(b) Initialisation of the algorithm with random colours
(c) Restored image with local inpainting
(d) Restored image with global inpainting

Image Restoration

Dr Carola-Bibiane Schönlieb

In our modern society we encounter digital images
in a lot of different situations: from everyday
life, where analogue cameras have long
been replaced by digital ones, to their professional
use in medicine, earth sciences, arts, and security
applications. The images produced in these situations
usually have to be organized and possibly
postprocessed. The organization and processing
of digital images is known under the name of image
processing or computer vision.

We often have to deal with the processing of im- 
ages, e.g., the restoration of images corrupted by 
noise, blur, or intentional scratching. The idea 
behind image processing is to provide methods 
that improve the quality of these images by post- 
processing them. For an introduction to digital 
image processing we refer to [3] or [8]. Virtual
image restoration or image interpolation (also
referred to as "inpainting") denotes the methodology
whereby missing parts of damaged images
are filled in, based on the information obtained
from the intact part of the image and a priori
assumptions made on the missing image structures.
Virtual image restoration is an important
challenge in our modern computerized society:
from the reconstruction of crucial information in
satellite images of our earth to the renovation of
digital photographs and ancient artwork, virtual
image restoration is ubiquitous. Considering this
huge - but by no means complete - range of image
processing applications and the fact that there
are still problems in this area which have not been
satisfactorily solved, it is not surprising that this
is a very active and broad field of research. From
mathematicians, to engineers and computer scientists,
a large group of people have been and are
still working in this area.

The Digital Image: 
a Mathematical Object? 

In order to appreciate the following theory and
the image processing applications, we first need to
understand what a digital image really is. Roughly
speaking, a digital image is obtained from an
analogue image (representing the continuous
world) by sampling and quantization. Basically
this means that the digital camera superimposes
a regular grid on an analogue image and assigns
a value, e.g., the mean brightness in this field, to
each grid element. In the terminology of digital
images these grid elements are called pixels. The 
image content is then described by grey values or 
colour values prescribed in each pixel. The grey 
values are scalar values ranging between 0 (black) 
and 255 (white). The colour values are vector val- 
ues, e.g., (r, g, b), where each channel r, g and b 
represents the red, green, and blue component of 
the colour and ranges, as the grey values, from 0 
to 255. 
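
Sampling and quantization can be sketched in a few lines. In the example below the "analogue image" is a hypothetical smooth brightness function on the unit square, the mean brightness of each grid cell is approximated by averaging a few sample points, and the result is rounded to integer grey values between 0 and 255; all names and parameter values are illustrative assumptions.

```python
import math


def analogue_image(x, y):
    """A hypothetical continuous brightness field on the unit square, with values in [0, 1]."""
    return 0.5 + 0.5 * math.sin(2 * math.pi * x) * math.cos(2 * math.pi * y)


def digitise(image, width, height, subsamples=4):
    """Sample on a width x height grid (mean brightness per cell) and quantise to 0..255."""
    pixels = []
    for row in range(height):
        line = []
        for col in range(width):
            total = 0.0
            for i in range(subsamples):
                for j in range(subsamples):
                    # sample points spread evenly inside the current grid cell
                    x = (col + (i + 0.5) / subsamples) / width
                    y = (row + (j + 0.5) / subsamples) / height
                    total += image(x, y)
            mean = total / subsamples ** 2
            line.append(min(255, max(0, round(255 * mean))))
        pixels.append(line)
    return pixels


grey = digitise(analogue_image, width=8, height=6)
for line in grey:
    print(" ".join(f"{value:3d}" for value in line))
```

A colour image would store three such values per pixel, one for each of the channels r, g and b.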

The mathematical representation of a digital
image is a so-called image function u defined
on a two dimensional (in general rectangular)
image domain, the grid. Indeed, in some
applications, images are three dimensional (e.g.
videos, 3D medical imaging) or even four dimensional
(involving three spatial dimensions and
time) objects, but for simplicity we focus on the
two dimensional case for the following conceptual
presentation. The image function is either scalar
valued in the case of a grey value image, or vector
valued in the case of a colour image. Here the
function value u(x, y) denotes the grey value, or
colour value, of the image in the pixel (x, y) of the
image domain. Figure 1 visualizes the connection
between the digital image and its image function.

Typical sizes of digital images range from
2000 × 2000 pixels in images taken with a simple
digital camera, to 10 000 × 10 000 pixels in images
taken with high-resolution cameras used by
professional photographers. The size of images in
medical imaging applications depends on the task
at hand. PET for example produces three dimensional
image data, where a full-length body scan
has a typical size of 175 × 175 × 500 pixels.

Now, since the image function is a mathematical
object we can treat it as such and apply mathematical
operations to it. These mathematical operations
are summarized by the term image processing
techniques, and range from statistical methods and
morphological operations to solving a partial differential
equation for the image function. We are
especially interested in the last, i.e., PDE and
variational methods used in virtual restoration.

We have introduced a digital image as a sampled
and quantised version of an analogue (also called
physical or real) image. The higher the resolution
of a digital image, the closer it is to the analogue
image in the real world. While digital image processing
is indeed concerned with digital images,
the methods used are often motivated from considerations
in the continuum, that is, methods are
formulated for the analogue image. In this article
we take up this mathematically more challenging
and analytically more beautiful position, and let
our image u be a continuous object defined on a
rectangular domain $\Omega = (a, b) \times (c, d)$. Within this
framework, there are many possibilities for how
images can be modelled, compare [8], Chapter
3. For our purposes we will focus on the representation
of images as elements in a function
space such as the Lebesgue space $L^2(\Omega)$, Sobolev
spaces such as $H^1(\Omega)$ and the space of functions
of bounded variation $BV(\Omega)$. The latter space is
especially suited for images since an element in
BV can be discontinuous and hence the representation
of image edges is possible.


Local and Global Features: 
What is Important in 
Image Restoration? 

An important task in image processing is the process
of filling in missing parts of damaged or occluded
images based on the information obtained
from the intact parts in the image. It is essentially
a type of interpolation and we will refer to it as
virtual image restoration or inpainting (various
terminologies are used for image interpolation
depending on the application).

Let f represent some given image defined on an
image domain $\Omega$. Loosely speaking, the problem
is to reconstruct the original image u in the (damaged)
domain $D \subset \Omega$, called inpainting domain or
a hole/gap (cf. Figure 2).

Virtual image restoration methods can be roughly 
divided into two groups: 1) local inpainting, and 
2) global inpainting methods. The main differ- 
ence between these two classes lies in the type 
of image information used from the intact part 
of the image, as well as the different kind of in- 
painting processes with which this information is 
propagated into the missing domain. 

A method is local if the information used to fill
in D is only taken from the boundary $\partial D$ (or a
small neighbourhood of D). In a local inpainting
method the restored image u can be formalised
as a solution of either a variational problem or a
partial differential equation (PDE). The easiest example
is harmonic inpainting, where:

$$u \in \operatorname{argmin}_v \left\{ \|\nabla v\|^2_{L^2(\Omega)} : \text{such that } v = f \text{ in } \Omega \setminus D \right\} \qquad (1)$$

Equivalently, the first-order optimality condition
for the above variational problem (Euler-Lagrange
equation) states that the restored image u
solves the Laplace equation

$$\Delta u = 0 \text{ in } D, \qquad u = f \text{ on } \partial D. \qquad (2)$$

As such, u can be seen as the harmonic extension
of f from $\partial D$ into D. Of course, any image
structures such as image edges are not preserved
by the harmonic extension (rather diffused into
D). More sophisticated local inpainting methods
have been proposed in the community during the
last fifteen years that are able to propagate geometric
image information such as object edges,
their orientation and curvature. These approaches
are mainly based on extensions of (1) and (2) to
non-smooth variational problems and nonlinear
(and often higher-order) PDEs, respectively. For
different types of local inpainting methods we
refer the reader for instance to [4], [5], [12] (inpainting
via transport), [14], [6] (TV inpainting),
[9] (curvature driven diffusion inpainting), [11]
(Mumford-Shah based inpainting) and [13], [15]
(Euler's elastica inpainting).

Figure 1 Digital image versus image function: Gradually zooming in to the level where the image pixels are visible (blue
framed detail), the image function of the red channel u(x, y, r) of the digital photograph is plotted as the height (the value
for red) over the (x, y)-plane.
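
Harmonic inpainting is simple enough to sketch on a pixel grid. The example below assumes the usual five-point discretisation of the Laplace equation, solved by Gauss-Seidel sweeps, and assumes that the hole does not touch the border of the image; the function name, the test image and the number of sweeps are illustrative choices rather than the method used for the figures in this article.

```python
def harmonic_inpaint(image, mask, sweeps=500):
    """Fill the masked pixels by Gauss-Seidel sweeps for the discrete Laplace equation.

    Assumes the masked region does not touch the border of the image.
    """
    u = [list(map(float, row)) for row in image]
    hole = [(i, j) for i, row in enumerate(mask) for j, inside in enumerate(row) if inside]
    for i, j in hole:
        u[i][j] = 0.0  # forget the damaged values
    for _ in range(sweeps):
        for i, j in hole:
            # discrete Laplace equation: each pixel is the mean of its four neighbours
            u[i][j] = 0.25 * (u[i - 1][j] + u[i + 1][j] + u[i][j - 1] + u[i][j + 1])
    return u


size = 12
# A linear ramp is harmonic, so the harmonic extension should reproduce it.
ramp = [[10.0 * j for j in range(size)] for i in range(size)]
mask = [[4 <= i < 8 and 4 <= j < 8 for j in range(size)] for i in range(size)]
restored = harmonic_inpaint(ramp, mask)
error = max(abs(restored[i][j] - ramp[i][j]) for i in range(size) for j in range(size))
print(error)
```

Replacing the ramp by an image with a sharp edge running through the hole shows the diffusion of edges mentioned above: the restored values vary smoothly across the hole instead of jumping.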

What all of these methods have in common is
that they can only reproduce local image features
(encoded in the value of the image function and
its derivatives in a pixel) but are not able to pick
up image patterns or texture which are non-local
image features. Global (or non-local) inpainting
methods take into account all the information
from the known part of the image, usually
weighted by its distance and similarity (measured
in a certain way) to a neighbourhood of the point
that is to be filled in. Such methods usually work
on image patches rather than on single image pixels.
They are mathematically formalised as non-local
variational problems and engineering-type
discrete algorithms. This class of methods is very
powerful, able to fill in structures and textures
almost perfectly. However, they still have some
disadvantages. One major one is the high computational
cost involved in their solution. For some
of these methods another disadvantage is their
dependence on an initial guess for the restored
image in D. Local methods are sometimes more
desirable, especially when the inpainting domain
is relatively small. If D is large a local method can
serve as a good initialisation for the global inpainting
method. For more discussion on global
methods the reader is referred to [7], [10], [1], [2].












Mathematical Algorithms 
Versus an Amateur's Attempt 

In August 2012 Cecilia Giménez, an eighty-year-old
amateur artist from a small village near
Zaragoza (Spain), gained fame by an attempt to
restore a wall painting in a local church. She
produced the by now famous painting dubbed
"Ecce Mono" (Behold the Monkey) when aiming
to restore the wall painting "Ecce Homo" (Behold
the Man) by the Spanish painter Elías García Martínez
(see the comparison in Figure 3).

Let's see what virtual image restoration methods
make of this. In Figure 4 a local and a global inpainting
result for the head of the Jesus figure are
shown. For the local inpainting we used higher-order
total variation inpainting (see [6]) and
for the global inpainting method a variational
exemplar-based method with the $L^1$-norm as similarity
measure between image patches (see [2]).
Local inpainting does pretty well in recovering
the main structures in the painting but smooths
out small-scale features and texture. Being initialised
with the local inpainting result, the global
inpainting method performs reasonably well. We
leave it to the reader to decide which restoration
is more realistic: Cecilia's "Ecce Mono" in Figure
3 or the mathematically formalised inpainting in
Figure 4.


Conclusion

Mathematical concepts such as nonlinear PDEs
and variational calculus offer a beautiful and rich
framework for formalising and solving real world
problems in imaging. Of course, virtual image
restoration cannot (yet or never?) replace human
expertise. In fact, virtual image restoration algorithms
have been very much influenced by the
experience and guidelines of art restorers, aiming
to formalise what art restorers do mathematically
(see [4]). Virtual image restoration can also help
art restorers by producing digital templates for
damaged art pieces.

Global inpainting: involve image
information from the entire intact part

[...] and propagate it inwards

Figure 2 Virtual image restoration: based on the intact image information f inside $\Omega \setminus D$, one seeks the inpainted image
u that extends f into the inpainting domain D. The difference between local and global inpainting lies in its conceptually
different method of recovering u from f.

Seahorse Valley

The Mandelbrot set contains smaller cop- 
ies of itself known as satellites. Here we 
zoom into the cleft of a first order satellite 
to find the same "seahorse" structure as 
in the main deft, with an additional layer 
of complexity shown in white and blue. 
It might seem somewhat implausible that functional
analysis, non-Euclidean geometry and
statistical shape analysis have much to tell us 
about the spread of European languages. Histori- 
cal linguistics has traditionally been something 
of a qualitative discipline, but recently there has 
been considerable interest in taking a more quan- 
titative approach to the subject, through textual 
analysis but also through the analysis of acoustic 
recordings. It is this latter data that has allowed 
some more unusual links between mathematics 
and phonetics to be made. 

Acoustic recordings yield considerable quantities 
of data which to all intents and purposes can be 
seen as continuous over time. For example, in Figure 1
below, a two dimensional surface (spectrogram)
can be seen, where the first axis represents
time while the second represents the frequency of
the sound wave being recorded. This spectrogram
not only conveys all the time and frequency information
contained in the word being said, but can
be treated (when suitably normalised) as a random
element, say $X$, $X \in L^2$.

Functional Data Analysis 

The relatively new field of functional data analysis (FDA) (see [2], [4]) is something of a cross between functional analysis and classical statistics. Unlike the usual univariate or multivariate analysis undertaken in most statistics, FDA is the branch of statistics that concerns data where the fundamental unit of that data is a function in some suitable (often infinite dimensional Hilbert) space. The main idea is to use the properties of smoothness and regularity in the functions to allow statistical analysis to be carried out, even though the functions are only ever discretely observed with noise.

Figure 1 An example of a raw spectrogram (in logarithmic scale) as obtained by taking a windowed discrete Fourier transform of a 22 kHz sound sample of a single syllable (the word one ("un") in French). The Fourier transform was computed every 10 ms to yield the discretised version of the function.

One of the most important quantities in FDA is the covariance operator. For a random square
integrable function $X$, with $E(X) = 0$, the operator
$C(y) = E(\langle X, y \rangle X)$, $y \in L^2$, is defined as the
covariance operator, where $\langle \cdot, \cdot \rangle$ is the usual inner
product in $L^2$ (see [2] for more details). It is, by
definition, non-negative definite, and in many
data analysis situations is assumed to be a trace
class operator, i.e. $\sum_k \lambda_k < \infty$, where $\lambda_k$ are the
eigenvalues of the spectral decomposition of the
operator. In many situations, FDA proceeds by 
using one of the fundamental theorems of stochastic
processes, the Karhunen-Loève decomposition
of the operator, to provide a basis for expansion
of the data. This allows a possible dimension
reduction on the data to be performed, something
that has been common in multivariate statistics
since the early 20th century, in the finite dimensional
setting. However, in the case of examining
differences between languages, it is the operator 
itself which will be of interest. 

Assume that we are interested in understanding 
the relationship between languages through their 
acoustic properties. Given a set of recordings for 
a particular language, spectrograms can be pro- 
duced and an estimate of the associated covari- 
ance operators obtained. Languages, of course, 
have many characteristics, but it has been shown 
that one characteristic of interest is the variational 
patterns that are present in the sounds. These dif- 
ferences are captured exactly by the covariance 
operators. Therefore by comparing covariance 
operators we can provide one particular compari- 
son of the languages themselves. 


Statistics in Non-Euclidean
Spaces

However, covariance operators are not the usual 
type of data that statistical analysis is designed for. 
They are non-negative definite trace class operators,
so do not lie in a standard “Euclidean” space.
The usual Euclidean metrics used in statistical 
analysis, extended to FDA, are not valid given 
the restricted space. This requires a new type of 
metric to be used, one with its roots in statisti- 
cal shape analysis (see [1]), where non-Euclidean 
geometry is commonplace. 

Let us start by considering a closely related finite dimensional problem, defining a distance between two positive definite matrices. Possibly the simplest approach would be to take the matrix logarithm and compute the usual Frobenius norm between the matrix logarithms. This is indeed a Riemannian distance on the space of positive definite matrices, and as such allows statistical analysis to be developed. However, even if our covariance operators were positive definite, their trace class nature implies that their eigenvalues tend to zero, and hence the equivalent of the matrix logarithm is unbounded. However, this is not the case for all transformations. For example, the square-root transformation is well defined and the resulting operator, while not guaranteed to be trace-class, is still a Hilbert-Schmidt operator, and as such the distances are still well defined.
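
The contrast between the two transformations can be seen in a small finite dimensional sketch. The helper names below are hypothetical and the 2 x 2 matrices are toy inputs, not data from the study; the point is only that the log-based distance grows without bound as one eigenvalue shrinks towards zero, while the square-root distance stays bounded.

```python
import numpy as np

def apply_to_eigenvalues(C, f):
    """Apply a scalar function f to a symmetric matrix C via its eigendecomposition."""
    w, V = np.linalg.eigh(C)
    return (V * f(w)) @ V.T

def log_euclidean_distance(C1, C2):
    # Frobenius norm between matrix logarithms (needs strictly positive eigenvalues).
    return np.linalg.norm(apply_to_eigenvalues(C1, np.log) - apply_to_eigenvalues(C2, np.log))

def square_root_distance(C1, C2):
    # Frobenius norm between symmetric square roots (fine for non-negative definite input).
    root = lambda w: np.sqrt(np.clip(w, 0.0, None))
    return np.linalg.norm(apply_to_eigenvalues(C1, root) - apply_to_eigenvalues(C2, root))

identity = np.eye(2)
for eps in (1e-2, 1e-6, 1e-12):
    C = np.diag([1.0, eps])  # one eigenvalue shrinking towards zero
    print(eps, log_euclidean_distance(C, identity), square_root_distance(C, identity))
```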

The square-root of a matrix, or operator, is, however, not uniquely defined. It would be somewhat more elegant if the distance between two languages was independent of the choice of square-root. This is a well studied problem in statistical shape analysis, where the equivalent problem is that of how to match shapes that are subject to rotation and translation. The shape of a dog is still a dog, whether it is standing with its head to the left or to the right. Equivalently, the square-root is uniquely defined only up to rotation, and as such by quotienting out the rotation group we obtain a unique distance. These ideas yield the following metric to measure the distances between our languages. For two covariances $C_1$ and $C_2$, the Procrustes metric is defined as

$$d_P(C_1, C_2)^2 = \inf_{R \in O(L^2(\Omega))} \|L_1 - L_2 R\|_{HS}^2,$$

where $L_i$ are such that $C_i = L_i L_i^*$, for $i = 1, 2$, and $O(L^2(\Omega))$ is the space of unitary operators on $L^2(\Omega)$. Procrustes was the Greek innkeeper of myth who fitted everyone to his iron bed by either stretching or chopping them to size, and as such this metric equivalently gives a distance that disregards the orientations of the initial estimates of the covariance operator. This distance, although somewhat complex, has a simple closed form solution, where, for any choice of $L_1$, $L_2$ satisfying the above,

$$d_P(C_1, C_2)^2 = \|L_1\|_{HS}^2 + \|L_2\|_{HS}^2 - 2\sum_{k=1}^{\infty} \sigma_k,$$

where $\sigma_k$ are the singular values of the compact operator $L_2^* L_1$. It can be shown that, even when there are only finite amounts of discretised data present, the estimates of the distance converge asymptotically.
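
For readers who want to experiment, here is a minimal finite dimensional sketch of the Procrustes distance, assuming covariance matrices in place of operators and the Frobenius norm in place of the Hilbert-Schmidt norm. The function names are hypothetical and the matrices are random toy inputs. It uses the closed form via singular values and also recovers a rotation attaining the infimum.

```python
import numpy as np

def psd_square_root(C):
    """Symmetric square root L of a non-negative definite matrix C, so C = L @ L.T."""
    w, V = np.linalg.eigh(C)
    return (V * np.sqrt(np.clip(w, 0.0, None))) @ V.T

def procrustes_distance(C1, C2):
    """Closed form: d^2 = ||L1||^2 + ||L2||^2 - 2 * (sum of singular values of L2^T L1)."""
    L1, L2 = psd_square_root(C1), psd_square_root(C2)
    sigma = np.linalg.svd(L2.T @ L1, compute_uv=False)
    d2 = np.linalg.norm(L1) ** 2 + np.linalg.norm(L2) ** 2 - 2.0 * sigma.sum()
    return np.sqrt(max(d2, 0.0))  # guard against tiny negative rounding errors

def optimal_rotation(C1, C2):
    """Orthogonal R minimising ||L1 - L2 R|| (the classical Procrustes solution)."""
    L1, L2 = psd_square_root(C1), psd_square_root(C2)
    U, _, Vt = np.linalg.svd(L2.T @ L1)
    return U @ Vt

rng = np.random.default_rng(0)
A, B = rng.standard_normal((2, 4, 4))
C1, C2 = A @ A.T, B @ B.T  # two toy covariance matrices
print(procrustes_distance(C1, C2))
```

Any other square roots $L_i$ with $C_i = L_i L_i^T$ would give the same value, since they differ from the symmetric ones only by an orthogonal factor.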

Investigating the Relationships
in Romance Languages

The statistical analyses of non-Euclidean and functional data are of interest in and of themselves, and are some of the fastest growing areas of modern statistics. However, this is in many ways because of their ability to give insights into other areas such as historical linguistics. In a recent study, recordings of the pronunciation of the numbers one to ten were taken from four different Romance languages (French, Italian, Spanish and Portuguese) with one language having two different dialects present (Iberian Spanish and American Spanish). 219 spectrograms (there were several repetitions of each word in each language) were generated from the sound samples, and preprocessed to form aligned functions from which covariances were formed. The distances between these covariances were then examined.

Figure 2 Representation of the geodesic taking a speaker saying the French word "un" and turning it into the Portuguese word "um". The geodesic is based on the Procrustes metric in the space of covariance functions.

It is possible to use the Procrustes metric to not
only define distances between covariances but
also by extension to define geodesics within the
space of covariance functions (see [3]). These can
then be used to define covariances for languages
“between” any two of the observed languages or 
even to predict how one speaker might sound 
when speaking another language. Figure 2 shows 
one such predicted path. Here a speaker saying 
the word “un” (one in French) is mutated along 
a geodesic path into saying the word “um” (one 
in Portuguese). The speaker characteristics are 
retained but the variations attributed to the lan- 
guages are captured via the geodesic path. These 
spectrograms can then be transformed back into 
audio to hear the results. This opens up a world 
of possibilities of discovering how one language 
might be related to another, or how historical lan- 
guage groups might have evolved into modern 
day languages. 

The integration of concepts from geometry, analy- 
sis and other areas of mathematics into data anal- 
ysis through statistics has a long history. However, 
modern data sources are constantly raising new 
challenges and areas such as non-Euclidean FDA 
are being developed for applications ranging from
brain imaging to those seen here in linguistics.
George Polya (1887-1985) was a Hungarian mathematician who made major contributions to the fields of number theory, combinatorics, probability, analysis and many more (see [3]). The following theorem is said to have been Erdos's favourite result proved by Polya:

Theorem 1 Let $f(z) = z^n + a_{n-1} z^{n-1} + \dots + a_0$ be a complex monic polynomial of degree $n$ and let $C = \{z \in \mathbb{C} : |f(z)| \le 2\}$ be the set of points mapped within the circle of radius 2 with centre the origin. Then the total length of the projection $C_L$ of $C$ onto any line $L$ in the complex plane never exceeds 4.


Let's look at an example. Take the complex function $f(z) = z^2 - 1$. We solve $|f(z)| \le 2$ to obtain the following region of the complex plane:

[Figure: the region $|z^2 - 1| \le 2$ in the complex plane]

The boundary of the region has equation

$$y^2 = 2\sqrt{1 + x^2} - 1 - x^2.$$
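
The boundary equation can be checked directly. The following short derivation is a sketch, writing $z = x + iy$ and treating $|z^2 - 1| = 2$ as a quadratic in $y^2$:

```latex
\begin{align*}
|z^2 - 1|^2 &= (x^2 - y^2 - 1)^2 + 4x^2y^2 = 4, \\
(y^2)^2 + 2(x^2 + 1)\,y^2 + (x^4 - 2x^2 - 3) &= 0, \\
y^2 &= -(x^2 + 1) + \sqrt{(x^2 + 1)^2 - (x^4 - 2x^2 - 3)} \\
    &= 2\sqrt{1 + x^2} - 1 - x^2 .
\end{align*}
```

The right-hand side is non-negative exactly when $\sqrt{1 + x^2} \le 2$, that is, when $|x| \le \sqrt{3}$, so the region extends from $-\sqrt{3}$ to $\sqrt{3}$ along the x-axis.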


If we wish to find the "longest" projection of this region onto a line $L$, we take $L$ parallel to the x-axis:

[Figure: the region $C$ and its projection $C_L$ onto the x-axis]

We then get that the length of the projection is $|C_L| = 2\sqrt{3} < 4$, as expected. We will now start proving this by taking simpler cases and then generalising. We will restrict ourselves to real polynomials and we will only take the projection of $C$ onto the x-axis, i.e. we will prove the following theorem:

Theorem 2 Let $f(x) = x^n + a_{n-1} x^{n-1} + \dots + a_0$ be a real monic polynomial of degree $n$ with $n$ real roots and let $C = \{x \in \mathbb{R} : |f(x)| \le 2\}$ be the set of points mapped within the interval [-2, 2]. Then $C$ can be covered by intervals with total length at most 4.

The reader can easily see how this relates to Polya's theorem. We will see that most of the work is in proving this special case and the transition from real to complex numbers is natural and easy.

Chebyshev's theorem 

We will use the following theorem by Chebyshev (see [2], p. 14):

Theorem 3 (Chebyshev) (proof omitted) Let $P(x)$ be a real monic polynomial. Then

$$\max_{-1 \le x \le 1} |P(x)| \ge \frac{1}{2^{n-1}},$$

where $n$ is the degree of $P(x)$.

This is a fact that seems completely unrelated to Polya's theorem but it leads us to the following corollary:

Corollary If $|P(x)| \le 2$ for all $x \in [a, b]$, then $b - a \le 4$.


Proof of Corollary This can be obtained by mapping the interval $[a, b]$ onto $[-1, 1]$ by substituting $y = \frac{2}{b - a}(x - a) - 1$ and applying Chebyshev's theorem to $P(x)$ as a polynomial $Q(y)$ in $y$ ($Q$ isn't monic but we can scale it to make it monic). Then since

$$\max_{a \le x \le b} |P(x)| = \max_{-1 \le y \le 1} |Q(y)|$$

we obtain

$$2 \ge \max_{a \le x \le b} |P(x)| \ge \underbrace{\left(\frac{b - a}{2}\right)^n}_{\text{scaling}} \frac{1}{2^{n-1}},$$

which yields $b - a \le 4$, as required.
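
The bound of 4 in the Corollary cannot be improved: the rescaled Chebyshev polynomial $2T_n(x/2)$ is monic and stays within $[-2, 2]$ on the whole interval $[-2, 2]$. The sketch below checks this numerically on a grid; the helper name is hypothetical, and the check is an illustration rather than part of the proof.

```python
import numpy as np
from numpy.polynomial import chebyshev, polynomial

def extremal_monic(n):
    """Power-basis coefficients (lowest degree first) of P(x) = 2 * T_n(x / 2)."""
    t_n = chebyshev.cheb2poly([0.0] * n + [1.0])  # T_n written in the power basis
    return np.array([2.0 * c / 2.0 ** k for k, c in enumerate(t_n)])

x = np.linspace(-2.0, 2.0, 20001)
for n in range(1, 8):
    coeffs = extremal_monic(n)
    # Leading coefficient (should be 1) and the maximum of |P| on [-2, 2] (should be 2).
    print(n, coeffs[-1], np.abs(polynomial.polyval(x, coeffs)).max())
```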


Now this looks more like it is going somewhere towards our goal! It shows that Theorem 2 is true if $C$ is an interval (instead of the more general union of disjoint intervals $I_1 \cup I_2 \cup \dots \cup I_k = [a_1, b_1] \cup [a_2, b_2] \cup \dots \cup [a_k, b_k]$, with total length $\ell(I_1) + \dots + \ell(I_k)$).

Polya's Idea 

Now what Polya did was to try and construct another real monic polynomial $\tilde{P}(x)$ of degree $n$ such that the projection onto the x-axis is an interval of length at least $\ell(I_1) + \dots + \ell(I_k)$. Here is how we do this: We have a polynomial $P(x) = (x - x_1)(x - x_2) \cdots (x - x_n)$ with $C = \{x \in \mathbb{R} : |P(x)| \le 2\} = I_1 \cup I_2 \cup \dots \cup I_k$, where the intervals are arranged in ascending order. After some elementary (but not necessarily easy) observations, we claim that the endpoints of the intervals have functional values +2 and -2 and that all of these intervals contain a root of the polynomial (see [1], pp. 141-142). We prove this by assuming $P(x) = 2$ at both endpoints and looking at the first and second derivatives at a critical point (which exists by Rolle's theorem). Then we use $P'(x)^2 \ge P(x)P''(x)$.

Now suppose $I_k$ contains $m$ roots of our polynomial (with their multiplicities), namely $x'_1, x'_2, \dots, x'_m$, and let the rest of the roots be $y'_1, y'_2, \dots, y'_{n-m}$. Of course we assume that $m < n$, otherwise $C$ would be a single interval and we would be done. Let $d$ be the distance between $I_k$ and $I_{k-1}$, i.e. $d = a_k - b_{k-1}$.

[Figure: the intervals $I_1, I_2, \dots, I_{k-1}, I_k$ on the real line, with a gap of length $d$ between $I_{k-1}$ and $I_k$]

Since the $x'$s and the $y'$s are roots, we can write $P(x) = Q(x)R(x)$ where $Q(x) = (x - x'_1) \cdots (x - x'_m)$ and $R(x) = (x - y'_1) \cdots (x - y'_{n-m})$ (we are "splitting" $P(x)$). Now we construct the following polynomial (new line because it is important):

$$P_1(x) = Q(x + d)R(x)$$

We can see that $P_1(x)$ has roots $x'_1 - d, x'_2 - d, \dots, x'_m - d, y'_1, y'_2, \dots, y'_{n-m}$. Now $C_1 = \{x : |P_1(x)| \le 2\}$ contains the intervals $I_1, I_2, \dots, I_{k-1}$ and $I_k - d$. That's because given a point $x \in I_1 \cup \dots \cup I_{k-1}$, we have $|x - x'_i + d| = -x + x'_i - d < -x + x'_i = |x - x'_i|$ since $x < x'_i - d$ for $x$ in the intervals under consideration, so $|Q(x + d)| \le |Q(x)|$ and then again $|P_1(x)| \le |P(x)| \le 2$. Similarly if $x \in I_k$ we look at $R(x)$: $|R(x - d)| \le |R(x)|$ and so $|P_1(x - d)| = |Q(x)||R(x - d)| \le |Q(x)||R(x)| = |P(x)| \le 2$. Here the last two intervals get "glued together" forming a single interval, so now we have $P_1(x)$ with only $k - 1$ intervals. By induction we can reach the desired $\tilde{P}(x)$. Note that the set of values of $x$ such that $|\tilde{P}(x)| \le 2$ is not the same as the set for our original polynomial, but instead has total length at least the total length of $C$ for $P(x)$. Thus Theorem 2 is proved. QED.

To help understand the process, we will look at another example. Let's take $P(x) = x^3 - 5x^2$. We have two intervals $I_1 \approx [-0.6, 0.68]$ and $I_2 \approx [4.92, 5.08]$, with $d \approx 4.24$, which satisfy $|P(x)| \le 2$. The second interval contains the root $x = 5$ while the first contains the double root $x = 0$. Therefore we split $P(x)$ into $Q(x) = x - 5$ and $R(x) = x^2$. We then form $P_1(x) = Q(x + d)R(x) = (x + 4.24 - 5)x^2$. Now we can check that $|P_1(x)| \le 2$ only has one interval as a solution, namely $[-1.05, 1.57]$, which contains the intervals $I_1 \approx [-0.6, 0.68]$ and $I_2 - d \approx [0.68, 0.84]$.

George Polya (1887-1985) was a Hungarian mathematician who made fundamental contributions to a wide range of topics, including series, number theory, mathematical analysis and geometry. He is also widely known for his work in heuristic reasoning, writing many books on the subject, including the famous How to Solve It.

"Beauty in mathematics is seeing the truth without effort."

Paul Erdos (1913-1996) was among the most prolific mathematicians of all time, working with hundreds of collaborators. He was also known for his eccentric personality. He spent much of his life as a vagabond, often turning up at colleagues' homes and announcing that his 'brain was open', and staying for a few days to collaborate, before moving on.

"Another roof, another proof"
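
The numbers in this example can be reproduced with a few lines of code. The sketch below is an illustration only: the helper `sublevel_intervals` is a hypothetical name, and it finds the set where $|P(x)| \le 2$ by locating the real roots of $P(x) = \pm 2$ and testing midpoints, which assumes the polynomial crosses $\pm 2$ transversally.

```python
import numpy as np

def sublevel_intervals(coeffs, bound=2.0, tol=1e-9):
    """Intervals of the real line on which |P(x)| <= bound (coeffs: highest degree first)."""
    p = np.poly1d(coeffs)
    # Candidate endpoints are the real roots of P(x) = bound and P(x) = -bound.
    pts = sorted(r.real for s in (bound, -bound) for r in (p - s).roots if abs(r.imag) < tol)
    intervals = []
    for lo, hi in zip(pts[:-1], pts[1:]):
        if abs(p(0.5 * (lo + hi))) <= bound:           # midpoint test
            if intervals and intervals[-1][1] == lo:    # adjacent pieces: merge them
                intervals[-1] = (intervals[-1][0], hi)
            else:
                intervals.append((lo, hi))
    return intervals

P = [1.0, -5.0, 0.0, 0.0]            # P(x) = x^3 - 5x^2
I1, I2 = sublevel_intervals(P)
d = I2[0] - I1[1]                    # gap between the two intervals
P1 = [1.0, d - 5.0, 0.0, 0.0]        # P1(x) = Q(x + d) R(x) = (x + d - 5) x^2
glued = sublevel_intervals(P1)
print(I1, I2, d, glued)
```

Here `d` is computed from the exact endpoints rather than rounded to 4.24, so the printed intervals agree with the text only to the two decimal places shown there.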

From Real Line to Complex 
Plane 

First of all, let's crack the scary "every line $L$ in the complex plane" bit. It takes very little thought to figure out that we only need to prove the theorem for $L$ being the real line and all the other lines can be obtained by rotating the plane. Thus we conclude that this part isn't as impressive as it sounds and we don't even need to worry about it. Now let's take our complex polynomial $f(z)$ from Theorem 1. Let $C_R = \{x \in \mathbb{R} : x = \mathrm{Re}(z) \text{ and } |f(z)| \le 2\}$. We can write $f(z) = (z - z_1)(z - z_2) \cdots (z - z_n)$ where $z_i = a_i + i b_i$ are the roots of $f(z)$. Now if we consider the "real" version $g(x) = (x - a_1)(x - a_2) \cdots (x - a_n)$ of $f(z)$, we have, writing $z = x + iy$, $|x - a_i|^2 + |y - b_i|^2 = |z - z_i|^2$ by Pythagoras' theorem. Hence $|x - a_i| \le |z - z_i|$ for all $1 \le i \le n$. This means that $|g(\mathrm{Re}(z))| \le |f(z)| \le 2$ for $z \in C$. Or, after taking a few seconds to think and let the above sink in, $C_R \subseteq \{x \in \mathbb{R} : |g(x)| \le 2\}$. Thus Theorem 2 actually implies Theorem 1 (Polya's result).
How would you try to solve a linear system of equations with more unknowns than equations? Of course, there are infinitely many solutions, and yet this is the sort of problem statisticians face with many modern datasets, arising in genetics, imaging, finance and many other fields. What's worse, our equations are often corrupted by noisy measurements! In this article we will introduce a statistical method that has been at the centre of the huge amount of research that has gone into solving these problems. We'll begin by reviewing the classical version of the problems, before moving on to the more modern setting hinted at above.

Regression Analysis 

Imagine data are available in the form of observations $(Y_i, x_i) \in \mathbb{R} \times \mathbb{R}^p$, $i = 1, \dots, n$, and the aim is to infer a simple regression function relating the average value of a response, $Y_i$, and a collection of predictors or variables, $x_i$. This is an example of regression analysis, one of the most important tasks in statistics.

Often, we may assume that the unknown regression function is linear in the predictors, giving the following mathematical formulation of the problem:

$$Y = X\beta + \varepsilon \qquad (1)$$

where $Y \in \mathbb{R}^n$ is the vector of responses; $X \in \mathbb{R}^{n \times p}$ is the predictor matrix with $i$th row $x_i^T$; $\varepsilon \in \mathbb{R}^n$ represents random error; and $\beta \in \mathbb{R}^p$ is the unknown vector of coefficients that determines the regression function and is to be estimated using the data.

A traditional application of the model (1) may have the responses as blood pressure measurements for $n = 100$ patients and the predictors could include height, weight, age and daily calorie intake, for example. In this case, one might estimate $\beta$ by ordinary least squares (OLS), a technique dating back to Gauss (1795). This yields an estimator $\hat{\beta}^{OLS}$ with

$$\hat{\beta}^{OLS} := \arg\min_{b \in \mathbb{R}^p} \|Y - Xb\|_2^2 = (X^T X)^{-1} X^T Y,$$

provided $X$ has full column rank. Here $\|\cdot\|_2$ denotes the Euclidean norm. We can analyse the quality of the estimate of the regression function by calculating its mean-squared prediction error (MSPE). Under the assumptions that (i) $E(\varepsilon_i) = 0$ and (ii) $\mathrm{Cov}(\varepsilon_i, \varepsilon_j) = \sigma^2 1_{\{i = j\}}$, it holds that

$$\mathrm{MSPE}(\hat{\beta}^{OLS}) := E\left(\frac{1}{n}\|X(\beta - \hat{\beta}^{OLS})\|_2^2\right) = \frac{p}{n}\sigma^2.$$

We see that provided $p/n$ is small, the MSPE is small. When this is true, and under the assumptions given above, OLS is a very reasonable choice of estimator. Indeed, the Gauss-Markov theorem shows that it has the minimum MSPE among all linear unbiased estimators of the regression function, i.e. among all estimators $\hat{F} := AY$ of $X\beta$, for some $n \times n$ matrix $A$ such that $E(\hat{F}) = X\beta$.
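
The value $\sigma^2 p / n$ for the MSPE comes from the fact that the OLS fitted values are a projection of $Y$ onto the $p$-dimensional column space of $X$. The sketch below illustrates this on simulated data; the design, coefficients and noise level are arbitrary assumptions chosen for the example.

```python
import numpy as np

rng = np.random.default_rng(1)
n, p, sigma = 100, 4, 1.5
X = rng.standard_normal((n, p))                  # toy predictor matrix (full column rank)
beta = np.array([2.0, -1.0, 0.5, 0.0])
Y = X @ beta + sigma * rng.standard_normal(n)

beta_ols = np.linalg.solve(X.T @ X, X.T @ Y)     # (X^T X)^{-1} X^T Y
hat = X @ np.linalg.solve(X.T @ X, X.T)          # projection onto the column space of X

# X(beta - beta_ols) = -hat @ eps, so E||X(beta - beta_ols)||^2 / n = sigma^2 * trace(hat) / n.
mspe = sigma**2 * np.trace(hat) / n
print(beta_ols, mspe, sigma**2 * p / n)
```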
High-dimensional Data

One might think that OLS essentially solves the problem of linear regression, at least under assumptions (i) and (ii). However, the field of statistics must constantly adapt and innovate to develop methods that accommodate the data it is tasked with studying, and today, much of that data is high-dimensional: $p$ is very large and often greatly exceeds $n$. Where in the past only a few carefully chosen variables were measured for each observation, nowadays any variable that might conceivably have an effect on the response tends to be recorded, leading to ratios of $p/n > 1000$ being common in some areas. Out of these many variables, it may be only a few that are really important for predicting the response, but of course which these are would not be known in advance. In the context of our model (1), this would translate as $\beta$ being sparse, i.e. many of its components being exactly 0.

How are we to proceed with analysis given such datasets? Clearly OLS is unhelpful when $p \ge n$: if $X$ has full row rank, then the predictions are simply the original responses themselves. Ideally, we would like a sparse estimator that first attempted to detect the relevant variables, and then estimated just their coefficients. Even when $\beta$ is not truly sparse, it can make sense to produce a sparse estimate for it. A final estimate with only a few non-zero coefficients may be much easier to interpret, and given a new observation $x \in \mathbb{R}^p$, computing the estimate of the regression function at $x$ would be much faster.

In view of this, one might consider trying to estimate $\beta$ by $\hat{\beta}^{BS}$ (Best Subsets) defined as the minimiser of a penalised least squares objective:

$$\hat{\beta}^{BS} := \arg\min_{b \in \mathbb{R}^p} \left\{ \frac{1}{2n}\|Y - Xb\|_2^2 + \lambda \|b\|_0 \right\}, \qquad (2)$$

where $\|b\|_0 := \sum_{k=1}^{p} 1_{\{b_k \ne 0\}}$. Large values of the regularisation parameter, $\lambda$, will cause $\hat{\beta}^{BS}$ to have very few non-zero components, and lower values will produce less sparse models. Several methods are available for choosing $\lambda$; we will not go into the details here.

A major problem with the estimator $\hat{\beta}^{BS}$ is that the optimisation in (2) is in general computationally infeasible as one would essentially need to evaluate the objective at all $2^p$ possible subsets of variables in order to guarantee finding the optimiser. As $p$ runs into the hundreds, the number of computations required to perform the optimisation quickly surpasses the number of atoms in the observable universe!
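
To make the cost concrete, here is a brute-force sketch of a Best Subsets search for a tiny problem. It is an illustration under assumptions: the function name is hypothetical, the data are simulated without noise, and the objective is normalised as residual sum of squares over $2n$ plus $\lambda$ times the subset size. Every one of the $2^p$ subsets gets its own least squares fit, which is only feasible because $p = 5$ here.

```python
import itertools
import numpy as np

def best_subset(X, Y, lam):
    """Exhaustive search over all 2^p supports; returns (support, objective, number tried)."""
    n, p = X.shape
    best, tried = (None, np.inf), 0
    for size in range(p + 1):
        for support in itertools.combinations(range(p), size):
            tried += 1
            if size:
                cols = list(support)
                coef = np.linalg.lstsq(X[:, cols], Y, rcond=None)[0]
                resid = Y - X[:, cols] @ coef
            else:
                resid = Y                      # empty model: predict zero
            objective = resid @ resid / (2 * n) + lam * size
            if objective < best[1]:
                best = (support, objective)
    return best[0], best[1], tried

rng = np.random.default_rng(2)
X = rng.standard_normal((20, 5))
Y = X @ np.array([3.0, 0.0, -2.0, 0.0, 0.0])   # noiseless, sparse toy signal
support, objective, tried = best_subset(X, Y, lam=0.01)
print(support, objective, tried)
```

Each extra variable doubles the number of fits, so the same loop with $p$ in the hundreds would never finish.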


The Lasso 

The key property of the objective in (2) which makes the optimisation intractable is that it is non-convex. Convex problems are much easier to solve, one reason for this being that any local optimum is also a global optimum. It is thus sensible to consider convex approximations to the objective in (2). One such approximation results from replacing $\|b\|_0$ with $\|b\|_1 := \sum_{k=1}^{p} |b_k|$, so our estimator $\hat{\beta}$ is given by

$$\hat{\beta} := \arg\min_{b \in \mathbb{R}^p} \left\{ \frac{1}{2n}\|Y - Xb\|_2^2 + \lambda \|b\|_1 \right\}. \qquad (3)$$

This is the Lasso (Least Absolute Shrinkage and 
Selection Operator) estimator (see [2]): one of 
the most popular methods in high-dimensional 
data analysis. Applications of the Lasso and re- 
lated methods range from identifying which of 
our thousands of genes are related to particular 
diseases, to the click-through rate prediction task 
that optimises web advertising for search engines. 

The optimisation in (3) can be solved even for very large problems where $p$ is hundreds of thousands. Yet importantly, sparsity of the estimator is retained. This is essentially because the set $\{b : \|b\|_1 \le r\}$ for $r > 0$ has corners; see [1] for details.
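
One simple way to compute a Lasso solution is cyclic coordinate descent with soft-thresholding. The sketch below is not necessarily the method used in practice for very large problems; it assumes the objective is normalised as $\frac{1}{2n}\|Y - Xb\|_2^2 + \lambda\|b\|_1$, and the function names and simulated data are hypothetical. It shows that many coefficients come out exactly zero even though $p > n$.

```python
import numpy as np

def soft_threshold(z, t):
    """Shrink z towards zero by t, returning exactly 0 when |z| <= t."""
    return np.sign(z) * max(abs(z) - t, 0.0)

def lasso_coordinate_descent(X, Y, lam, sweeps=1000):
    """Minimise ||Y - Xb||^2 / (2n) + lam * ||b||_1 by cyclic coordinate descent."""
    n, p = X.shape
    b = np.zeros(p)
    resid = Y.astype(float).copy()          # residual Y - Xb for the current b
    col_sq = (X ** 2).sum(axis=0) / n
    for _ in range(sweeps):
        for j in range(p):
            resid += X[:, j] * b[j]         # remove coordinate j from the fit
            rho = X[:, j] @ resid / n
            b[j] = soft_threshold(rho, lam) / col_sq[j]
            resid -= X[:, j] * b[j]         # put the updated coordinate back
    return b

rng = np.random.default_rng(3)
n, p, lam = 40, 80, 0.3                     # more variables than observations
X = rng.standard_normal((n, p))
beta = np.zeros(p)
beta[:3] = [3.0, -2.0, 1.5]                 # sparse toy coefficient vector
Y = X @ beta + 0.3 * rng.standard_normal(n)
b = lasso_coordinate_descent(X, Y, lam)
print(np.count_nonzero(b), "non-zero coefficients out of", p)
```

The exact zeros arise from the soft-thresholding step, which is the coordinate-wise counterpart of the corners of the set $\{b : \|b\|_1 \le r\}$.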

Sparsity and computational feasibility are attrac- 
tive properties, but what really makes the Lasso 
appealing is its remarkable performance as both 
a selector of important variables, and as a predic- 
tion engine. This is perhaps surprising given that 
the Lasso optimisation only approximates (2). A 
vast amount of work has gone into trying to un- 
derstand why the Lasso works so well, and also 
into developing improvements and adapting the 
method to suit many other problems. We will finish with a theorem that forms part of that work: we show that provided $\|\beta\|_1$ is small, as would be the case when $\beta$ is sparse, $p$ can be almost exponential in $n$, and the Lasso can still estimate the regression function well. This is a really rather striking result, considering that OLS requires $p$ to be much smaller than $n$. In fact much more is true; the interested reader is directed to [1].

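Prediction Error of the Lasso

Theorem Let $\hat{\beta}$ be the Lasso estimator (3) with $\lambda = \sigma\sqrt{2(\log p + t^2)/n}$. Assume that the errors are independent and normally distributed. Then with probability at least $1 - e^{-t^2}$, we have

$$\frac{1}{2n}\|X(\beta - \hat{\beta})\|_2^2 \le 2\sigma\|\beta\|_1\sqrt{\frac{2(\log p + t^2)}{n}}.$$

Proof By the definition of $\hat{\beta}$ we have

$$\frac{1}{2n}\|Y - X\hat{\beta}\|_2^2 + \lambda\|\hat{\beta}\|_1 \le \frac{1}{2n}\|Y - X\beta\|_2^2 + \lambda\|\beta\|_1.$$

Recalling that $Y = X\beta + \varepsilon$ and rearranging, we get

$$\frac{1}{2n}\|X(\beta - \hat{\beta})\|_2^2 \le \frac{1}{n}\varepsilon^T X(\hat{\beta} - \beta) + \lambda\|\beta\|_1 - \lambda\|\hat{\beta}\|_1.$$

Denote the $k$th column of $X$ by $X_k \in \mathbb{R}^n$. Now

$$\frac{1}{n}\varepsilon^T X(\hat{\beta} - \beta) \le \frac{1}{n}\|\hat{\beta} - \beta\|_1 \max_k |X_k^T \varepsilon|,$$

and as $\varepsilon \sim N_n(0, \sigma^2 I)$ and $\|X_k\|_2^2 = n$, we have $X_k^T \varepsilon / n \sim N(0, \sigma^2/n)$. Now let $Z \sim N(0, 1)$ so $\sigma Z/\sqrt{n}$ has the same distribution as $X_k^T \varepsilon / n$ for each $k$. We argue

$$P\left(\max_k |X_k^T \varepsilon|/n > \lambda\right) = P\left(\bigcup_{k=1}^{p} \{|X_k^T \varepsilon|/n > \lambda\}\right) \le \sum_{k=1}^{p} P(|X_k^T \varepsilon|/n > \lambda) = p\,P(|Z| > \lambda\sqrt{n}/\sigma).$$

A standard tail bound for normal random variables (proved below) gives us that for $\zeta > 0$, $1 - \Phi(\zeta) \le e^{-\zeta^2/2}/2$. Applying this to the above, we see that the final term in the last display is bounded above by $p\exp\{-n\lambda^2/(2\sigma^2)\} = e^{-t^2}$. Now working on the event $\{\max_k |X_k^T \varepsilon|/n \le \lambda\}$, which we have shown has probability at least $1 - e^{-t^2}$, we have from (3) that

$$\frac{1}{2n}\|X(\beta - \hat{\beta})\|_2^2 \le \lambda\|\hat{\beta} - \beta\|_1 + \lambda\|\beta\|_1 - \lambda\|\hat{\beta}\|_1.$$

Noting that $\|\hat{\beta} - \beta\|_1 - \|\hat{\beta}\|_1 \le \|\beta\|_1$ by the triangle inequality completes the proof.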

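Standard Normal Tail Bound

Theorem Let $Z \sim N(0, 1)$. Then $1 - \Phi(\zeta) := P(Z > \zeta) \le \frac{1}{2}e^{-\zeta^2/2}$ when $\zeta > 0$.

Proof Let $f(\zeta) = 1 - \Phi(\zeta) - \frac{1}{2}e^{-\zeta^2/2}$. Now

$$P(Z > \zeta) = \frac{1}{\sqrt{2\pi}}\int_\zeta^\infty e^{-z^2/2}\,dz \le \frac{1}{\zeta\sqrt{2\pi}}\int_\zeta^\infty z e^{-z^2/2}\,dz = \frac{1}{\zeta\sqrt{2\pi}}e^{-\zeta^2/2}.$$

Thus if $\zeta \ge \sqrt{2/\pi}$ then $f(\zeta) \le 0$. Also, $f(0) = 0$. Finally observe that

$$f'(\zeta) = e^{-\zeta^2/2}\left(\frac{\zeta}{2} - \frac{1}{\sqrt{2\pi}}\right),$$

so $f'(\zeta) \le 0$ for $\zeta \le \sqrt{2/\pi}$. Conclude that $f(\zeta) \le 0$ when $\zeta > 0$.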
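
The tail bound is easy to check numerically. The sketch below uses the standard identity $1 - \Phi(\zeta) = \frac{1}{2}\mathrm{erfc}(\zeta/\sqrt{2})$ and compares it with $\frac{1}{2}e^{-\zeta^2/2}$ at a few points; the function names are hypothetical.

```python
import math

def normal_upper_tail(zeta):
    """P(Z > zeta) for Z ~ N(0, 1), via the complementary error function."""
    return 0.5 * math.erfc(zeta / math.sqrt(2.0))

def tail_bound(zeta):
    """The bound exp(-zeta^2 / 2) / 2 from the theorem."""
    return 0.5 * math.exp(-zeta ** 2 / 2.0)

for zeta in (0.0, 0.5, math.sqrt(2.0 / math.pi), 1.0, 2.0, 4.0):
    print(zeta, normal_upper_tail(zeta), tail_bound(zeta))
```

The two columns agree at $\zeta = 0$ and the bound stays above the true tail probability everywhere else.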
The Pythagoras trees are fractals constructed from squares. Each triple of touching squares encloses a right-angled triangle, thus demonstrating Pythagoras' Theorem infinitely many times.

Recent observations indicate that the Universe as a whole is dominated by unknown substances. Dark energy is needed to make the expansion rate increase with time. The precise way in which this happens indicates that it makes up ~68% of the total. It is thought to be spread completely uniformly and may be a property of spacetime itself. Other arguments indicate that baryons (protons and neutrons) can only make up 5%. The strongest argument comes from nuclear reactions between protons and neutrons shortly after the Big Bang. These reactions generated a large amount of helium and traces of deuterium. The amounts produced would be altered if there were more baryons. Today, therefore, the existence of dark matter seems indisputable.


Figure 1 The total mass-energy content of the Universe (pie chart; legible label: baryons 5%)


Where Dark Matter Might Be 


Dark matter is thought to be an undiscovered fundamental particle which can't radiate, preventing it forming small objects like planets. If it tried, it would get hot and the pressure would stop further collapse. But do galaxies contain large amounts of dark matter, or is it spread even more thinly? Answers may be provided by galactic rotation curves (Figure 2). For circular orbits, the speed should be given approximately by

$$\frac{v^2(r)}{r} = \frac{G M_b(<r)}{r^2}.$$

Note $M_b$ means baryonic mass only, as one might at first expect. Only the mass at radii smaller than $r$ contributes to the force because of the Shell Theorem. At large radii, this includes the bulk of the mass. Thus, $M_b$ hardly varies with $r$, making for a so-called Keplerian rotation curve. This drops to 0 at large radii. It does not go flat at a non-zero value. Real rotation curves, however, usually do.

Taking a hint from Figure 1, most scientists believe that adding dark matter can resolve the problem. The trick is to fudge the mass distribution so that $M(<r) \propto r$, even outside the bulk of the visible mass. The stars and gas are not distributed like this, but the dark matter might be. This explanation seemed to resolve other problems too, the main one being that a self-gravitating disk (like a spiral galaxy) is unstable. Adding a huge halo of dark matter provides an additional restoring force.

The dark matter and the baryons would be subject to very different effects. For example, radiation from supernovae can heat and eject large amounts of gas but little affect dark matter. Gas can also be accreted from surrounding regions more easily than dark matter. Thus, the ratio of the two should be unlikely to remain fixed. Observing the baryons would shed little light on how much dark matter there should be, just as measuring a star's properties can't tell us what sort of planets orbit it. Galaxies are perhaps even more complicated. Because of this, plus the fact that it is the dark matter that dominates their total mass, we wouldn't expect it to be possible to predict the rotation curve based on the observed baryons.


The Baryonic to Dark Matter 
Ratio in Galaxies 

Despite all this complexity, however, the value of the velocity at which a rotation curve flatlines (if it does so) can be predicted remarkably accurately based on the baryonic mass alone. This result is called the Baryonic Tully-Fisher Relation (Figure 3). This is surprising, considering that some galaxies have lost more than 95% of the baryons originally present (assuming a 1 : 5 ratio of baryonic : dark matter initially). This loss is often due to supernovae - explosions when massive stars die. In a dwarf galaxy, only a few of them would be necessary to remove most of the baryonic mass. Thus, loss of baryons must be a somewhat random process. Certainly it seems feasible that two dwarf galaxies could have started similarly, with one losing 'only' 90% of its baryons and the other losing 95%. The latter would have nearly the same rotation velocity but only half the baryonic mass. Galaxies can also accrete gas from their surroundings, with the amount depending on their environment and merger history (whether other galaxies collided with them).

The ratio of baryonic to dark matter would therefore seem unpredictable. But it should certainly be very small, especially in dwarf galaxies. Baryons should hardly matter. This, together with their complexity, would make them a poor guide to the total mass. Yet accounting for the baryons lets one predict the dynamics fairly straightforwardly and with very high accuracy. The most natural explanation is that baryons are all there is.

Figure 2 Rotation speed as a function of radius R (kpc) in a disk-like galaxy. The dashed line is the prediction of Newtonian dynamics, based on observed baryonic mass. The solid line is the prediction of MOND.

Figure 3 This shows the flat rotation velocity against baryonic mass for a large number of galaxies. Different colours indicate gas dominated (usually less massive) or star dominated, quite different types of galaxy. The MOND prediction is the thin red line, assuming $a_0 = 1.2 \times 10^{-10}$ m/s$^2$.


MOND

This elegant solution suffers a serious problem: observed baryons don't exert enough force. Or do they? If one is willing to modify Newton's laws, then masses may exert more gravity than he assumed. Or perhaps Newton's second law does not work, though we won't consider that here. Thinking along these lines, Mordehai Milgrom proposed in 1983 a theory known as Modified Newtonian Dynamics (MOND). The essential thing is for gravity to behave as $1/r$ rather than $1/r^2$ at large distances from a point mass, thereby leading to a flat rotation curve. Looking at the data, Milgrom realised that distance was not the crucial parameter, rather acceleration was. Newtonian gravity needed to break down below a threshold acceleration, $a_0$.

The governing equation for gravitational fields (the Modified Poisson Equation) reads

$$\nabla \cdot \left[ \mu\!\left(\frac{|g|}{a_0}\right) g \right] = -4\pi G\rho.$$

$g$ is the gravitational field (force per unit mass) and $\mu$ is a function. For spherically symmetric situations, the end result is that

$$\mu\!\left(\frac{g}{a_0}\right) g = g_n,$$

where $g_n$ is the acceleration if Newtonian gravity were to hold exactly. If we set $\mu(x) = 1 \;\forall x$, then we recover the usual Poisson equation (leading to Newtonian gravity). Although everyday experience clearly shows that $\mu$ must be 1 for everyday accelerations, there is no good reason why this should be the case for very low accelerations. In fact, Milgrom suggested that $a_0$ was a tiny $1.2 \times 10^{-10}$ m/s$^2$.

The Modification comes from setting $\mu(x) = x$ when $x \ll 1$. This way, we get $g = (g_n a_0)^{1/2}$. Putting in $g_n = GM/r^2$ for a point mass, one sees that the force eventually becomes

$$g = \frac{\sqrt{GMa_0}}{r}.$$

Note that the force due to a combination of masses is not the sum of the forces due to each considered individually. Using $g = v^2/r$ to find the speed of particles (e.g. stars) on circular orbits, one obtains two important results: the orbital speed does not vary with $r$ (by construction) and

$$v_\infty = \sqrt[4]{GMa_0}.$$

Predictive Power 

Not only can MOND predict the value of $v_\infty$, it
can also fit individual rotation curves in detail 
(Figure 4), including bumps and wiggles. These 
are due to similar features in the underlying mass 
distribution. 
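The MOND relations above can be sketched numerically for the simplest case, a point mass. This is only an illustration, not how the published fits were made: it assumes the interpolating function $\mu(x) = x/(1+x)$ mentioned later in the article, for which $\mu(g/a_0)\,g = g_N$ reduces to a quadratic in $g$. The mass value and the function names are hypothetical.

```python
import math

G = 6.674e-11      # gravitational constant, m^3 kg^-1 s^-2
A0 = 1.2e-10       # Milgrom's threshold acceleration, m/s^2
M_SUN = 1.989e30   # solar mass, kg
KPC = 3.086e19     # one kiloparsec, m


def newtonian_g(mass, r):
    """Newtonian acceleration of a point mass at distance r."""
    return G * mass / r**2


def mond_g(g_n, a0=A0):
    """Solve mu(g/a0) * g = g_n, assuming mu(x) = x / (1 + x).

    The equation becomes g^2 - g_n*g - g_n*a0 = 0; we keep the positive root.
    """
    return 0.5 * (g_n + math.sqrt(g_n**2 + 4.0 * g_n * a0))


def circular_speed(g, r):
    """Speed on a circular orbit, from g = v^2 / r."""
    return math.sqrt(g * r)


def v_infinity(mass, a0=A0):
    """Asymptotic flat rotation speed, v = (G M a0)^(1/4)."""
    return (G * mass * a0) ** 0.25


if __name__ == "__main__":
    mass = 1e10 * M_SUN  # hypothetical baryonic mass, for illustration only
    for r_kpc in (1, 5, 20, 100):
        r = r_kpc * KPC
        g_n = newtonian_g(mass, r)
        v_newton = circular_speed(g_n, r) / 1e3
        v_mond = circular_speed(mond_g(g_n), r) / 1e3
        print(f"R = {r_kpc:4d} kpc  Newton: {v_newton:6.1f} km/s  MOND: {v_mond:6.1f} km/s")
    print(f"asymptotic speed: {v_infinity(mass) / 1e3:.1f} km/s")
```

At small radii the two curves nearly agree, while at large radii the Newtonian speed keeps falling and the MOND speed levels off towards $v_\infty$.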

Without MOND, however, it would be a remark- 
able coincidence if a bump in the baryonic mass 
density were matched by a similar (but much 
larger) bump in the dark matter density at the 
same position. After all, baryons can form clumpy 
structures because they can radiate and cool. Dark 
matter cannot radiate, so it cannot be expected to 
have such small-scale features. Moreover, it is in a 
nearly spherical halo whereas the baryons are in a
disc. So it is very difficult to see how a theory like
MOND that just boosts the gravity of the baryons 
by a fixed factor (depending on the acceleration) 
can ever match reality. 

There are dozens more galactic rotation curves 
well fitted by MOND. By and large, without add- 
ing arbitrary and very large amounts of dark mat- 
ter, it seems to predict rotation curves well (using 
$\mu(x) = x/(1+x)$). However, as is inevitable, there is
a small fraction of cases (about 10%) in which the
observations are less accurate, the orbits of
stars aren't purely circular, etc.

Figure 5 summarises the results from rotation 
curve studies of nearly 100 galaxies. Amazingly, 
the baryonic mass distribution is sufficient to 
predict the acceleration. This is despite galaxies 
supposedly having lost most of their baryons in 
highly complicated processes and accreted vari- 
able amounts depending on environment, lead- 
ing inevitably to different relative proportions of 
baryonic and dark matter. This proportion should 
not be calculable based on observing the baryons 
alone. Yet Figure 5 essentially provides a formula 
for doing so, even when the true acceleration is 
tens of times larger than $g_N$ (see [1]).

Further Evidence for MOND 

Galaxies may appear isolated, but they often in- 
teract. Standard simulations of this indicate gas 
and dark matter end up separated due to their 
very different initial distributions and physics. 
Gas is drawn into long tidal tails, which may later 
form into dwarf galaxies. Crucially, these galaxies 
should be devoid of dark matter. They are also too 
puny to accrete much of it. Therefore, their rotation
curves should follow from applying Newton's
laws to the observed baryons. Figure 6 shows the 
results of such an attempt. 

Mergers between spiral galaxies can easily de- 
stroy their disks, forming an elliptical galaxy. Yet 
disks are hardly rare: the majority of nearby heavy 
galaxies are rotating disks, with little evidence of 
disruption. One reason why mergers are believed 
to be common is because galaxies are supposedly 
surrounded by huge dark matter halos, making 
them easy targets. Another is that galaxies empty 
their surroundings fairly slowly, leading to colli- 
sions at late times. By then, a lot of the gas has 
been converted into stars. This means there is lit- 
tle gas drag to prevent disruption of the disk in a 
merger. 


On a larger scale, the distribution of galaxies is 
unusual in the standard picture. This involves 
starting with a slightly inhomogeneous mixture 
of dark and baryonic matter. Overdense regions 
have stronger gravity and become even denser, 
some eventually forming galaxies. The Local Void 
is a nearby large underdense region. Observing 
a portion of the Local Void (within 25 Mly of
Earth), one finds three large galaxies. Yet simula- 
tions suggest we should observe around 19. 

One possibility is that galaxies exert stronger 
gravity than in the model, letting them empty 
their surroundings faster and more thoroughly. 
This might also explain the prevalence of disks. 
MOND would indeed provide stronger gravity. 
It can also solve several other problems not dis- 
cussed here (see [2]). 

Conclusion

The Universe probably has large amounts of dark 
matter. It is tempting to use this to explain mo- 
tions within galaxies. But doing so leads to amaz- 
ing coincidences and major problems. Moreover, 
a theory that does not invent vast and arbitrary 
amounts of invisible matter actually performs 
better. Galaxies can and should be understood us- 
ing only actually observed mass. What you see is 
all there is. But using Newtonian dynamics will 
force you to invent dark matter. 

Real dark matter might well exist, but only on 
larger scales. Perhaps it resides in galaxy clusters, 
where it is needed to explain dynamics, even in 
MOND (the required ratio of baryonic to dark 
matter is, on average, the same as in Figure 1). 
This would allow it to slow down the expansion 
of the Universe without affecting internal galactic 
dynamics. If this were so, then General Relativity 
(which reduces to Newtonian dynamics in galax- 
ies) cannot be the whole story. Considering that 
this theory has never been directly tested at accelerations
as low as $a_0$, perhaps this is not very
surprising. After all, no theory works everywhere. 
Figure 4 Using the same value of $a_0$, it is also possible to fit in detail the rotation curve of NGC1560, a dwarf galaxy. It is clear
that the true rotation curve is best explained by multiplying the Newtonian rotation curve (without dark matter) by some
factor which increases with r. This is precisely what MOND does. If instead we add a smooth halo of dark matter, the bump
disappears. Note the MOND calculation is based on actually observed mass, unlike the dark matter curve (the total halo
mass and size are adjusted to match the data best).



Figure 5 The ratio between the true acceleration and that predicted by Newtonian dynamics from the baryonic mass, as a 
function of the latter. MOND requires a unique relation between the two, unexpected with dark matter. Quantum gravity 
effects should be important at ultralow accelerations (left of red line) because the energy density in the gravitational field 
is smaller than that in the vacuum due to quantum fluctuations (the zero point, or dark, energy).
Figure 6 Rotation curves of 3 galaxies born from debris of a galactic close encounter. The dashed curves are predictions of
Newtonian dynamics without dark matter, which should not be present in these galaxies. The solid curves are predictions
of MOND. Both predictions and observations have errors shown. The inclination of the dwarfs to the plane of the sky is fixed
at the most likely value, based on the orbital geometry of the interacting progenitor galaxies. 
The Death of a
Mathematician 

Dr Mario Livio 

Senior Astrophysicist, Hubble Space Telescope Science Institute


In the morning hours of May 30, 1832, a single
shot fired from 25 paces hit Évariste Galois in
the stomach. Although fatally wounded, Ga- 
lois did not die on the spot. He remained lying on 
the ground until an anonymous good Samaritan, 
perhaps a former army officer, perhaps a peasant
passing by, picked him up and brought him to the 
Cochin hospital in Paris. The following day, with 
his younger brother Alfred at his side, Galois 
died of peritonitis. His last known words were:
"Don't cry, I need all my courage to die at twenty."

This was the gloomy end of the most romantic 
of all mathematicians - a young man in whose 
mind the sweeping ideals of the French Revo- 
lution were inseparable from the revolutionary 
new branch of mathematics he had invented. 
Galois is the originator of Group Theory - the 
mathematical language that describes symmetry. 
Just as arithmetic has become the language of 
accounting, and colour and shape the language 
of abstract art - mathematicians, physicists, and 
even economists use group theory to explore the 
labyrinths of symmetry. 

You might have expected that every intimate de- 
tail in the life of such a prominent mathematician 
would be widely known. Yet, Galois's death has
remained veiled in mystery for almost two centu- 
ries. What is known is that Galois was killed in a 
duel with pistols on that fateful morning in 1832, 
but the questions of who killed him and why 
have been the subject of conspiracy theories ga- 
lore. Biographers have further been perplexed by 
the fact that the wounded young man appeared 


to have been abandoned in the field. 

Following three years of intensive research, I proposed
in 2005 that the fog surrounding Galois's
mysterious death may have finally lifted. 

Various Theories

The known facts concerning Galois's activities
in the last week of his life are precious few. Even
Galois's own three letters, indicating that "two patriots"
(meaning active republicans) provoked the
duel over "something so contemptible" involving 
an "infamous coquette," did not shed sufficient 
light on the identity of his opponents or their 
motives. The fact that Galois was a revolutionary 
firebrand inspired many of his early biographers 
to speculate that political enemies killed Galois. 
A few have allowed their imaginative plots to take 
off and include even more intrigue, suggesting 
that the "coquette" was in fact a police agent mas- 
querading as a prostitute. 

The first clue pointing to an unrequited love as the 
potential cause for the duel came from the work of 
an unlikely "detective" - a Uruguayan university 
professor. Using a magnifying glass and special 
lighting to examine Galois's papers, Carlos Alberto
Infantozzi discovered in 1968 the identity of
the "infamous coquette" - Stephanie Potterin du 
Motel. This young woman lived in a building that 
housed a convalescent home where the troubled 
Galois was placed on parole after being released 
from prison. Stephanie was certainly neither a 
prostitute nor a police provocateur. 
Based on Infantozzi's "forensic" work, as well as on
an article from the period in the Lyon newspaper
Le Précurseur (reproduced by the French author
André Dalmas in 1956), Dalmas and the American
physicist and author Tony Rothman have
started to painstakingly put together the pieces of 
the puzzle. They suggested that the person who 
shot Galois was Stephanie's presumed lover (and
a personal friend of Galois), since they believed 
that the young seductress was playing a double 
game with the hearts of the two young men. 

This was more or less the accepted consensus 
until 1996, when a new biography by an Italian 
historian of mathematics turned the entire story 
on its ear. Laura Toti Rigatelli suggested that the 
famous duel wasn't even a duel at all! Rather, this
somewhat Machiavellian theory proposed that
Galois sacrificed himself for the Republican cause
- the republicans needed a corpse to stir up rebellion,
and he offered his. While many accepted Toti
Rigatelli's story, not all did. The French researcher
and author Jean-Paul Auffray, who conducted an
extensive study of documents related to Galois,
concluded that the duel was real. Auffray reintroduced
the theory that the unfortunate love affair
with Stephanie provoked the duel, and suggested
that one of the opponents was none other than
Stephanie's father.

I have always been fascinated by Galois. How can 
you not be? When you realize that this flamboyant
romantic brought about one of the greatest break- 
throughs in mathematics, and that he achieved 



this feat before the age of twenty! When I start- 
ed to research the life, and especially the death, 
of this visionary genius, I decided to embark on 
this task with no prejudices, and to leave no stone 
unturned. Having had the added advantage of 
being able to examine critically all the evidence 
collected by numerous researchers and their con- 
clusions, in three years I was able to develop what 
at least appears to be an entirely self-consistent 
picture. While the new theory clearly contains 
elements of previous scenarios, it combines these 
elements with new insights that give them, in my 
humble opinion, an enhanced credibility. I there- 
fore strongly believe that I have come closer to the 
truth than was ever possible before. 

My Conclusion

So, who killed Galois and why? A key point ig- 
nored by many biographers is that Galois always 
talked about two people who provoked the duel. 
One could, therefore, not expect to find a com- 
plete answer without an identification of both op- 
ponents. My conclusion is that these two people 
were Denis Faultrier and Ernest Duchatelet. The 
former was a close friend of Stephanie's family,
and he later married her widowed mother. The
latter was Galois's republican friend (and Stephanie's
presumed lover), and it was he who fired
the fatal shot. The entire affair was a classical
case of cherchez la femme. From Stephanie's two
devastating letters to Galois we learn that either
by some careless words, or by too impetuous a
behaviour, the inexperienced Évariste offended



Galois at 17, as drawn by his brother 
Galois at 15, as drawn by a classmate
Galois's Birthplace
All images: Municipality of Bourg-la-Reine, through the assistance of Philippe Chaplain


Stephanie, who immediately informed her so-called
"fiancé" (Duchatelet) and so-called "uncle"
(Faultrier; the two descriptions were given by
Galois's cousin, Gabriel Demante). When the two
men confronted Evariste, the hot-blooded young 
man added insult to injury, referring to the entire 
incident as a "miserable piece of gossip." At a time 
when invitations to duels were issued at the drop 
of a hat, this was more than enough for the two 
men to challenge Galois. A seventeen-year-old 
young woman who did not return his love sealed 
the fate of one of the most brilliant mathemati- 
cians to have ever lived. 

Why did Galois appear to have been left wounded 
on the ground by most, if not all, of the seconds? 
Galois' autopsy report describes a large bruise on 
his head that was probably caused when he fell. He 
might have been knocked unconscious and pre- 
sumed dead. One of the reports from the period 
notes that a "former officer" brought Galois to the 
hospital. This fits Denis Faultrier, a former captain
in the national guard, and the second opponent in 
my scenario, like a glove. In the book "The Equation
that Couldn't Be Solved", I presented more
details of what had led me to the proposed course 
of events. Can the two-centuries-old case finally
be closed? Hopefully, yes. But with a number of 
gaps in the hard evidence, uncertainties are likely 
to remain. What is certain is the fact that Évariste
Galois will always be remembered as one of the 
most creative individuals to have ever lived. The 
new branch of mathematics that he established 
has expanded far beyond the boundaries of pure 
mathematics, into the realms of physics, econom- 
ics, music, visual arts, and wherever symmetries 
can be found. 
Triangular Wriggle, by Roberto Giardili

The Triangular Wriggle won the prize for 'Most 
Effective Use of Mathematics' at the 2013 
Bridges Conference for Mathematics in Art. The 
sculpture is based on a Lindenmayer-system, 
a parallel rewriting system that can be used to 
generate fractals. 
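
As a minimal sketch of what "parallel rewriting" means, the snippet below applies every rule to every symbol of a word at once, generation by generation. The rules used are Lindenmayer's classic "algae" example, chosen only for illustration; they are not the rules behind the sculpture, and the function name is hypothetical.

```python
def rewrite(axiom, rules, generations):
    """Apply the rewriting rules to every symbol simultaneously, repeatedly.

    Symbols without a rule are copied unchanged.
    """
    word = axiom
    for _ in range(generations):
        word = "".join(rules.get(symbol, symbol) for symbol in word)
    return word


# Illustrative rules (Lindenmayer's algae system): A -> AB, B -> A.
ALGAE_RULES = {"A": "AB", "B": "A"}

if __name__ == "__main__":
    for generation in range(6):
        print(generation, rewrite("A", ALGAE_RULES, generation))
```

To draw a fractal, the symbols of the final word are typically read as drawing instructions (move forward, turn left, turn right), which is a separate step not shown here.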
The rise and development of the field of
complex dynamics (pioneering research
in this area has been carried out by A.
Douady, P. Fatou, J. Hubbard, G. Julia, C. McMullen,
J. Milnor, W. Thurston, J. C. Yoccoz
et al.) revealed a set with properties of such peculiarity
and beauty that one cannot help but
be amazed at it. It is perhaps the most famous
example of how chaos, or to be even more precise,
infinite complexity, can be created by iterating
a procedure as simple as $z \mapsto z^2 + c$.


Definition and Basic Result

Let $f(z) = z^2 + c$. The Mandelbrot set is defined as:

$$M = \{\, c \in \mathbb{C} \mid \exists K \in \mathbb{R}^+ : |f^{(n)}(0)| < K,\ \forall n \in \mathbb{N} \,\}$$

where $f^{(n)}$ denotes the n-fold application of $f$. Put
simply, M is the set of all complex numbers c for
which the sequence

$$\left( f^{(n)}(0) \right), \quad n = 1, 2, \ldots$$

is bounded in modulus. Strictly, we are interested
in the dynamical system

$$(\mathbb{C},\ z \mapsto z^2 + c) = F_c$$

which is the real object we wish to understand.
At first sight, the choice of the number 0 in the
definition seems arbitrary. However, 0 is the only
critical point of $f(z) = z^2 + c$ (the point where its
derivative vanishes). Since critical points govern
the dynamics of $f$ (see [4]), the use of the number
0 makes sense.

The iteration seems fairly simple, yet the resulting
shape is certainly not:

Figure 1 The shape of the Mandelbrot set

The first reasonable question to pose is what happens
if we restrict the domain to the real numbers.
In this article we will focus our attention on real
parameters. Our task is to find the set $A = M \cap \mathbb{R}$,
called the antenna.
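
The definition can be explored numerically with a short sketch. It relies on the standard fact that once an iterate of 0 has modulus greater than 2 the sequence escapes to infinity; the iteration cap and the function name are assumptions of this example. A result of True only means no escape was seen within the cap, so it is evidence of boundedness rather than a proof.

```python
def stays_bounded(c, max_iter=500, escape_radius=2.0):
    """Iterate z -> z*z + c from z = 0 and report whether |z| stays small.

    Returns False as soon as |z| exceeds the escape radius, and True if
    that never happens within max_iter steps (not a proof of boundedness).
    """
    z = 0j
    for _ in range(max_iter):
        z = z * z + c
        if abs(z) > escape_radius:
            return False
    return True


if __name__ == "__main__":
    for c in (-2.01, -2, -1, 0, 0.25, 0.26, 1j, 1 + 1j):
        print(c, stays_bounded(c))
```

On the real axis the samples behave as the antenna $A = M \cap \mathbb{R}$ would suggest: the values between -2 and 0.25 stay bounded, while -2.01 and 0.26 escape.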
Theorem $M \cap \mathbb{R} = [-2, \tfrac{1}{4}]$

Proof We begin by stating a ratio test for sequences,
in the following form:

Ratio test for sequences Let $a_n$ be a sequence of
positive reals such that

$$\lim_{n \to \infty} \frac{a_{n+1}}{a_n} = \Lambda .$$

■ If $\Lambda > 1$ the sequence diverges (escapes to
infinity)

■ If $\Lambda < 1$ the sequence converges.

First of all, for a real parameter $c = x$ the sequence
$b_n = f^{(n)}(0)$ satisfies $b_{n+1} = b_n^2 + x$ and thus
$b_{n+1} - b_n = b_n^2 - b_n + x$.

Lemma 1 $A \cap (\tfrac{1}{4}, \infty) = \emptyset$

Proof For $x > \tfrac{1}{4}$, notice that $b_i > 0\ \forall i$ and
$b_{n+1} - b_n > (b_n - \tfrac{1}{2})^2 \ge 0$, which in turn implies that our
sequence is increasing. This implies that the limit

$$\lim_{n \to \infty} b_n$$

exists (but is not necessarily finite). Hence
$\lim_{n \to \infty} x / b_n$ exists and is finite. In fact,

$$\lim_{n \to \infty} \frac{b_{n+1}}{b_n} = \lim_{n \to \infty} \left( b_n + \frac{x}{b_n} \right)$$

exists (but is not necessarily finite).

Now we make use of the inequality
$b_n + \frac{x}{b_n} \ge 2\sqrt{x} > 1$ (AM-GM) and we conclude
that

$$\lim_{n \to \infty} \frac{b_{n+1}}{b_n} > 1$$

so the sequence diverges for these values of x.

Lemma 2 $[0, \tfrac{1}{4}] \subset A$

Proof To show that $\tfrac{1}{4} \in A$ we prove by induction
that $b_n < \tfrac{1}{2}\ \forall n$. Indeed, the base case is trivially true;
if $b_k < \tfrac{1}{2}$, then $0 < b_{k+1} = b_k^2 + \tfrac{1}{4} < \tfrac{1}{2}$. For the
rest of the proof, observe that each term of the
sequence is a polynomial in x with only positive
coefficients and hence each term forms a strictly
increasing function of x in $(0, \infty)$. Thus, since
$x \in [0, \tfrac{1}{4}]$, we have $b_n = b_n(x) \le b_n(\tfrac{1}{4}) < \tfrac{1}{2}$ and the
lemma follows.

Let's now move on to the negative reals.

Lemma 3 $A \cap (-\infty, -2) = \emptyset$

Proof We demonstrate once more that, within the given
interval, our sequence is positive (from $b_2$ onwards) and increasing.
Note that for $x < -2$ we have $b_2 = x^2 + x > |x| > 2$.
Furthermore, $b_3 - b_2 = x^3 (x + 2) > 0$. Now, observe
that $b_{n+1} - b_n = b_n^2 - b_{n-1}^2$ and an easy induction
shows the sequence is increasing. For every number
x in our interval, the same proof as in lemma
1 implies that the limit

$$\lim_{n \to \infty} \left( b_n + \frac{x}{b_n} \right) = \lim_{n \to \infty} \frac{b_{n+1}}{b_n} = \Lambda$$

exists (though it may be $+\infty$) and is equal to

$$\Lambda = \lim_{n \to \infty} \left( b_n + \frac{x}{b_n} \right) \ge \lim_{n \to \infty} \left( 2 + \frac{x}{b_n} \right) > 1$$

since $b_n > |x|$. Again, by the ratio test for sequences,
the claim follows.

Lemma 4 $[-2, 0] \subset A$

Proof If $x = -2$ then $b_n(x) = 2\ \forall n \ge 2$, so $-2 \in A$.
If $x \in (-2, 0]$, then $b_{n+1} = b_n^2 + x > b_n^2 - 2 \ge -2$
and the sequence is bounded below. Finally,
$|b_2| = |x^2 + x| \le |x|$. By induction, if $|b_k| \le |x|$ then
$b_{k+1} = b_k^2 + x \le x^2 + x \le |x| < 2$ and the sequence is
also bounded above, which proves the lemma. □

Lemmas 1-4 combined finish the proof of the
theorem. □
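
A few of the claims used in Lemmas 1-4 can be spot-checked in exact rational arithmetic. The sketch below is only a sanity check on sample values, not a substitute for the proofs; the helper name `orbit` and the sample parameters are choices made for this example.

```python
from fractions import Fraction


def orbit(x, n_terms):
    """Return [b_1, ..., b_n] with b_1 = x and b_{k+1} = b_k**2 + x (exact)."""
    terms = []
    b = Fraction(0)
    for _ in range(n_terms):
        b = b * b + x
        terms.append(b)
    return terms


if __name__ == "__main__":
    # Lemma 2: for x = 1/4 every term stays strictly between 0 and 1/2.
    quarter = orbit(Fraction(1, 4), 12)
    print(all(0 < b < Fraction(1, 2) for b in quarter))

    # Lemma 4: for x = -2 the sequence is -2, 2, 2, 2, ...
    print(orbit(Fraction(-2), 5))

    # Lemma 3: for x = -3 the identity b_3 - b_2 = x^3 (x + 2) holds.
    x = Fraction(-3)
    b = orbit(x, 3)
    print(b[2] - b[1] == x**3 * (x + 2))
```

Exact fractions are used because the denominators double in bit length at every step, so only a modest number of terms is computed.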


Some Further Remarkable 
Properties 

Infinite complexity No matter how much you
zoom in on a point close to the boundary of the
Mandelbrot set, there is going to be a new geometric
shape revealed after any degree of magnification!

Preperiodic (Misiurewicz) points We call a sequence
$(a_n)$ pre-periodic if and only if it becomes
periodic after a finite number of steps, that is if
there exist M, T such that $a_{n+T} = a_n\ \forall n > M$. Accordingly,
a point in the Mandelbrot set is called a Misiurewicz
point if the resulting sequence from this
point is pre-periodic. (If the reader is interested,
he/she may search about post-critically finite
maps, Thurston's theorem or rigidity.)
Misiurewicz points are dense on the boundary of the
Mandelbrot set and in fact, the Mandelbrot set is 
self-similar around such a point (its geometric 
image when zoomed in on that point resembles 
the initial shape of Figure 1). 
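
Pre-periodicity can be checked directly for parameters whose orbits are computed exactly, such as small integers and Gaussian integers. The sketch below is an illustration under that assumption (it is not reliable for general floating-point parameters), and the function name is hypothetical. It returns the index at which the cycle starts and the cycle length; a start index of 0 means the orbit of 0 is periodic from the outset, a case usually excluded from the Misiurewicz points.

```python
def preperiod_and_period(c, max_terms=50):
    """Find the first repeated value in the orbit 0, c, c*c + c, ...

    Returns (start, period), where start is the index at which the cycle
    begins and period is its length, or None if no repeat is found.
    Only meaningful when the arithmetic is exact for the chosen c.
    """
    seen = {}
    z = 0
    for index in range(max_terms):
        if z in seen:
            return seen[z], index - seen[z]
        seen[z] = index
        z = z * z + c
    return None


if __name__ == "__main__":
    for c in (-2, 1j, -1, 0):
        print(c, preperiod_and_period(c))
```

For c = -2 the orbit is 0, -2, 2, 2, ... and for c = i it is 0, i, -1 + i, -i, -1 + i, ..., so both settle into a cycle only after a few steps.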
Julia Sets (see [2]) Quadratic Julia sets are generated
by the mapping $z \mapsto z^2 + c$, for fixed c. The
filled-in Julia set is the set of all complex numbers 
z for which the sequence defined by the itera- 
tions of the mapping does not approach infinity. 
The definition of a Julia set generalises to rational 
functions, but in the case in which we are interest- 
ed, it is a very nice property that the Mandelbrot 
set is the set of all points c for which the corre- 
sponding Julia set is connected. 

Relation to the logistic map There exists an 
amazing correspondence between points of the 
antenna of the Mandelbrot set (on the boundary 
of the so-called “Mandelbrot bulbs”) and the bi- 
furcation diagram of the logistic map, as seen in 
Figure 3. 
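
One standard way to see this correspondence is a change of variables: writing the logistic map as $x \mapsto r x (1 - x)$ and setting $z = r(\tfrac{1}{2} - x)$ turns it into $z \mapsto z^2 + c$ with $c = r(2 - r)/4$. The sketch below checks this numerically; the function names and the sample values of r and x are choices made for this example.

```python
import math


def logistic_step(x, r):
    """One step of the logistic map x -> r x (1 - x)."""
    return r * x * (1 - x)


def quadratic_step(z, c):
    """One step of the quadratic map z -> z^2 + c."""
    return z * z + c


def c_from_r(r):
    """Parameter c of the quadratic map conjugate to the logistic map with parameter r."""
    return r * (2 - r) / 4


def z_from_x(x, r):
    """Change of variables z = r (1/2 - x) linking the two maps."""
    return r * (0.5 - x)


if __name__ == "__main__":
    r, x = 3.2, 0.3
    c, z = c_from_r(r), z_from_x(x, r)
    for _ in range(20):
        x, z = logistic_step(x, r), quadratic_step(z, c)
        assert math.isclose(z, z_from_x(x, r), abs_tol=1e-9)
    print("r = 1, 3, 4 correspond to c =", c_from_r(1), c_from_r(3), c_from_r(4))
```

Under this map, r = 1 and r = 4 land on the two ends of the antenna, c = 1/4 and c = -2, and the first period doubling at r = 3 lands on c = -3/4.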

A Challenge: Open Problem 

A number c is called hyperbolic if the critical point 
of the map $f(z) = z^2 + c$ is attracted to a periodic cycle
(i.e. the sequence b n converges to a periodic cycle 
in our case). Is the number c = -1.99999999 hyperbolic
(see [4] p.14)? The point is that no computer
calculation is reliable and thus an answer to
this question carries a high scientific interest. 

As an epilogue, it suffices to say that the field of
research in fractals - chaotic dynamical systems - 
is extremely active and we know much less about 
it than we would like to! 
Figure 2 The Julia set



Figure 3 The Mandelbrot set and the logistic map 
Julia Set

This picture is derived from the Julia set
for the function $\exp(z^3) - 0.621$.




Geometry Through 
the Eyes of Physics 

Prof David Tong

Professor of Theoretical Physics; DAMTP


It's no secret that there is a close connection
between geometry and physics. Probably the 
most famous example is the theory of General 
Relativity, in which the force of gravity is recast 
in terms of the geometry of space and time. The 
purpose of this article, however, is not to wax poetic
about geometry in Nature. Instead, I'd like to
describe how things work the other way round, 
when Nature gets into geometry. I will try to ex- 
plain how we can use ideas from physics to give 
new insight into mathematics. 

To tell the story, we'll need two simple ideas: one
from maths and one from physics. From maths,
the main character is a manifold. If you haven't
heard of this before, then you should have in the 
back of your mind a curved, closed surface, like 
that of a sphere or a torus. A manifold is a gener- 
alisation of this shape to higher dimensions. The 
purpose of geometry is to understand the prop- 
erties of different manifolds, the relationships be- 
tween them and the language we need to describe 
them. Meanwhile, from physics, the only object 
that we'll need to begin with is the humble particle.
Our plan is as follows: we'll place the particle
on the manifold and let it roam around. By understanding
the behaviour of the particle, we'll try to
infer various properties of the underlying space. 

To start, we'll think about a particle obeying the
laws of classical mechanics. Here there are few 
surprises and the particle does exactly what you 
would expect: it rolls around, guided by the con- 
tours of the space. The path it takes has some 
special mathematical properties and is called a 


geodesic. But the particle is too limited to know 
anything very deep about the underlying mani- 
fold. Its perspective is too parochial; it knows only 
about the small region in its immediate neigh- 
bourhood and has little to tell us about the global 
properties of the manifold. 



A Calabi-Yau manifold

Andrew J Hanson, Indiana University 


Geometry and Quantum 
Mechanics 

Things get more interesting when we turn to 
quantum mechanics. In the quantum world, the 
particle no longer has a definite position. Instead,
things are more uncertain and we have to talk in
the language of probabilities. The mathematical
description of a quantum particle is in terms of
a wavefunction, $\psi(x)$. This is a complex valued
function, with x a set of coordinates which label
points on the manifold. The probability of finding
a particle at the point x is proportional to $|\psi(x)|^2$.
The fact that the quantum particle spreads out in
a wave of uncertainty gives it more power. It can 
feel its way all over the manifold. It knows about 
the global structure of the space. The state of the 
particle is described by the Schrödinger equation

$$-\nabla^2 \psi = E \psi \qquad (1)$$

You've probably seen the symbol $\nabla^2$ before. It's
called the Laplacian. Roughly speaking, it means
that you should differentiate $\psi$ twice with respect
to every coordinate that it depends upon. The first
time you see the Laplacian is usually in the context
of flat $\mathbb{R}^3$, where $\mathbf{x} = (x, y, z)$ and

$$\nabla^2 = \frac{\partial^2}{\partial x^2} + \frac{\partial^2}{\partial y^2} + \frac{\partial^2}{\partial z^2} .$$


There is an obvious generalisation of this to dif- 
ferent dimensions. But, most importantly, there 
is also a generalisation of the Laplacian to mani- 
folds that are curved. In this case, the Laplacian 
depends on the metric on the manifold which 
means that the symbol contains within it infor- 
mation about the distances between different 
points on the manifold. 


The E in equation (1) is just a real number. Physicists
would identify it with the energy of the particle.
The key idea is that the Schrödinger equation
doesn't admit solutions $\psi(x)$ for any value of E.
Instead, there are only solutions when the energy
takes certain, discrete values. Moreover, because
$\nabla^2$ depends on the underlying space, so too does
the list of allowed energies. This provides a very
different way of thinking about geometry. You
give me a manifold and specify its shape and curvature
(or, more precisely, its topology and metric).
With that information, I solve the Schrödinger
equation and hand you back a list of numbers E.
That list of numbers is called the spectrum of the 
Laplacian and it contains, encoded with it, much 
of the information about the manifold. This way 
of thinking is called spectral geometry. 

There is a more down-to-earth version of spectral 
geometry, made famous by the mathematician 
Mark Kac in an article called "Can One Hear the 
Shape of a Drum?". The frequencies at which a 
drum beats are again governed by the equation 
(1), now with particular boundary conditions im- 
posed by the shape of the rim of the drum. The 
question is: if you know all the frequencies, can 
you figure out the shape? The answer, it turns out, 
is no, but you can extract a lot of information. 
Similarly, it is known in geometry that the spec- 


trum is not necessarily sufficient to determine 
uniquely the underlying manifold. Nonetheless, 
the study of spectral geometry is a rich subject, 
with different properties of the manifold encoded 
in the spectrum in interesting ways. 


It will be useful to work through a (very) simple 
example of spectral geometry: the one-dimen- 
sional circle. We will label the position along 
the circle by the coordinate x. If the circle has 
radius R, we should identify $x \equiv x + 2\pi R$. The
Schrödinger equation now reads

$$-\frac{d^2 \psi}{dx^2} = E \psi .$$

The solutions are simply $\psi = e^{inx/R}$. The information
that the space is a circle arises through the
requirement that $\psi$ is single valued, so that $\psi(x) =
\psi(x + 2\pi R)$. This tells us that we must have $n \in \mathbb{Z}$.
The spectrum of the circle is therefore just a tower
of numbers

$$E = \frac{n^2}{R^2}, \qquad n \in \mathbb{Z}.$$

We'll return to this shortly.
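
The circle example can be checked numerically with a small sketch. It applies a finite-difference second derivative to $\psi = e^{inx/R}$ and recovers $E = n^2/R^2$, and it shows that the single-valuedness condition fails when n is not an integer. The step size, sample point and function names are assumptions of this example.

```python
import cmath
import math


def psi(x, n, radius):
    """Candidate wavefunction exp(i n x / R) on a circle of radius R."""
    return cmath.exp(1j * n * x / radius)


def energy_estimate(n, radius, x=0.7, h=1e-4):
    """Estimate E in -psi'' = E psi using a central finite difference."""
    second = (psi(x + h, n, radius) - 2 * psi(x, n, radius) + psi(x - h, n, radius)) / h**2
    return -second / psi(x, n, radius)


def is_single_valued(n, radius, x=0.7):
    """Check psi(x) = psi(x + 2 pi R) at one sample point."""
    return abs(psi(x + 2 * math.pi * radius, n, radius) - psi(x, n, radius)) < 1e-9


if __name__ == "__main__":
    radius = 2.0
    for n in (0, 1, 2, 3):
        print(n, energy_estimate(n, radius).real, n**2 / radius**2)
    print(is_single_valued(3, radius), is_single_valued(0.5, radius))
```

The estimates agree with $n^2/R^2$ up to discretisation error, so a smaller circle pushes the whole tower of energies upwards.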


Although I introduced spectral geometry by 
thinking about quantum physics, the subject 
wasn't discovered by physicists. Nonetheless, it's
pleasing that it sits so naturally in the framework 
of quantum mechanics and there are many fur- 
ther related connections between the two subjects. 
For example, a more complicated quantum me- 
chanical Hamiltonian which has a property called 
supersymmetry naturally captures the de Rham or
Dolbeault cohomology of the manifold. In this 
way, many of the great results from differential ge- 
ometry can be recast in the language of quantum 
mechanics. However, rather than exploring these 
directions here, I would instead like to tell you 
about something novel and surprising that came 
out of thinking about geometry in the language 
of physics. 

Geometry and String Theory 

String theory is currently the best guess that we 
have for a unified theory of gravity and quantum 
mechanics. The basic idea is, on the face of it, 
slightly daft: string theory postulates that, at the 
fundamental level, if you look deep inside every 
particle, you will see a tiny vibrating loop of string. 
At the moment there is no experimental evidence 
for string theory. Nonetheless, it is a powerful 
mathematical framework. Here we're going to
bring that framework to bear on questions in geometry.
We use the same strategy that we've seen
above and ask: what is the energy spectrum of a 
string moving on a manifold? 

Let's return to our example of the circle. Now
there are two different things that the string can
do. First, the string can form a little loop which
then moves around the circle. Because, from afar,
this loop of string looks like a particle, it shouldn't
be too much of a surprise to learn that the energy
spectrum is identical to that of a particle:
$E = n^2/R^2$ with $n \in \mathbb{Z}$. But the string can also do
something that the particle can't: it can stretch
itself all around the circle. You can think of the
string as an elastic band; stretching it costs energy
and a string which winds $m$ times around the
circle has energy $E = (2\pi m R)^2$, with $m \in \mathbb{Z}$. This
means that the energy spectrum of a string moving
on a circle consists of two towers of numbers

$$E = \frac{n^2}{R^2} + 4\pi^2 m^2 R^2, \qquad n, m \in \mathbb{Z}.$$

But there's something interesting here. This set of
numbers remains the same if we swap

$$R \leftrightarrow \frac{1}{2\pi R}. \qquad (2)$$

This means that, if all you're given is this list of
numbers, then you can't tell the difference between
very big circles of size $R$ and very small circles
of size $1/(2\pi R)$. As far as the string is concerned,
these circles look exactly the same! Of course,
we've only discussed the energy spectrum of the
string but it turns out that all properties of the
string remain invariant under the interchange (2).
Strings really can't tell the difference between big
circles and small circles. This beautiful fact has a
rubbish name: it is called T-duality.
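The invariance of the spectrum under the swap (2) can be checked numerically. The following is a minimal sketch, not part of the original article: the function name `string_spectrum`, the cut-off `n_max` and the radius `R = 0.7` are illustrative assumptions, and only a finite, symmetric window of the two towers is compared.

```python
import math

def string_spectrum(R, n_max=3):
    """Energies E = n^2/R^2 + 4*pi^2*m^2*R^2 for |n|, |m| <= n_max, sorted."""
    window = range(-n_max, n_max + 1)
    return sorted(n**2 / R**2 + 4 * math.pi**2 * m**2 * R**2
                  for n in window for m in window)

R = 0.7                          # arbitrary illustrative radius
R_dual = 1 / (2 * math.pi * R)   # the radius obtained from the swap (2)

# The swap exchanges the roles of n and m, so the two sorted lists agree.
same = all(math.isclose(e1, e2, rel_tol=1e-9, abs_tol=1e-12)
           for e1, e2 in zip(string_spectrum(R), string_spectrum(R_dual)))
print(same)
```

Because the window is symmetric in n and m, exchanging the two labels maps it onto itself, which is why a finite truncation is enough for the comparison.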

The confusion of strings extends to other mani- 
folds as well. Roughly speaking, manifolds come 
in pairs. Although particles view these pairs very 
differently, to a string they look identical. (This is 
literally true of a special class of manifolds called 
Calabi-Yau and there is a slightly generalised ver- 
sion of the statement for other manifolds). But 
these two manifolds are not related in a simple 
way like the big and small circles. Instead, at first
sight, the two manifolds seem to have nothing to
do with each other. Typically, they don't even have
the same topology (i.e. the same number of holes). 



How strings can behave 

Steuard Jensen, Alma College 


This pairing between manifolds is called mirror 
symmetry. The string's inability to distinguish between
these two manifolds turns out to be a great
strength. For a start, we learn that there's a very
surprising and unexpected relationship between 
manifolds. Moreover, it turns out that mathema- 
ticians were often able to say a lot about one of 
these manifolds, but almost nothing about the 
other. Yet, according to string theory, the two 
manifolds should be identical; you just have to 
look at them in the right way. Any question that 
you can answer about the first manifold is telling
you something interesting about the other. (Technically,
questions in complex geometry for the first
manifold are turned into questions in symplectic 
geometry for the second). Mirror symmetry then 
becomes a powerful tool which allows you to re- 
interpret properties of one manifold to provide 
answers to previously unsolved questions about 
the other. 

Mirror symmetry was discovered almost 25 years 
ago. In the intervening time, it has become one 
of the most vibrant areas of research in geometry, 
with insight coming from both mathematicians 
and physicists. There is, admittedly, a difference
in the style of research. Physicists tend not to be 
overly consumed with matters of rigour, relying 
instead on an intuition for how Nature should 
work to build conjecture upon conjecture. Math- 
ematicians, of course, are not content until each 
conjecture becomes a proof. Yet this is one of an 
increasing number of areas in which mathematicians
and physicists find themselves exploring the
same questions hand in hand. It is a relationship 
which has enriched both communities. 
Volvox

Volvox is a genus of chlorophytes, a type 
of green algae. It forms spherical colonies 
of up to 50000 cells. The colonies contain 
eyespots, allowing them to swim towards 
light. 
Building an igloo, or dome in general, is
a task humanity has faced since antiquity.
The chord lengths of geodesic domes were
considered classified military information in the
United States until the sixties, and some believe 
that the secrets of medieval cathedral dome build- 
ers form the origins of Freemasonry. Even now, 
the construction is not an easy procedure. 

The Inuit are known for their ability to build snow 
domes. They build layers of bricks in a spiral pat- 
tern, causing the dome to close in loxodromically 
(see Figure 1). Due to the multitude of different 
brick shapes, this method is rather difficult for the
amateur to carry out. 


Figure 1 The Inuit method for building an igloo

Mathematical Formulation

In developing an easier process for igloo building,
we are interested in the following question: is it
possible to split the spherical dome into identical
elements? The answer is yes, of course. For example,
we can cut the dome into n slices using lines
of longitude, forming spherical triangles with two
right angles at the base. Such a form would not be
very useful for our purposes though. We must impose
additional requirements on our block forms,
with the first two considered essential, and the
final two ideal:

1. We want to use as few different shapes as possible,
ideally just one.

2. The volume and dimensions of the shapes
should be small fractions of the total dome volume
and radius.

3. The shapes should be roughly polyhedral.

4. The building procedure should be described
by a simple algorithm.

Very similar requirements are found in many areas
of science, for instance in the construction of
grids on spheres in climate research, and in football
construction.

It is well known that if three positive integers
$p, q, r$ satisfy $1/p + 1/q + 1/r > 1$, then the
spherical triangle with angles $A = \pi/p$, $B = \pi/q$,
$C = \pi/r$ provides a non-overlapping tiling of
the sphere. Since the area of each triangle is
$S = \pi(1/p + 1/q + 1/r - 1)$, the half-sphere is divided
into $2\pi/S$ segments. The smallest possible such
triangle has $p = 2$, $q = 3$, $r = 5$. It is a right-angled
triangle, which splits the half-sphere into 60 tiles.
30 of them are 'left-handed', and the remaining 30
are the mirrored counterparts of these.
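The tile count follows directly from the area formula above. The sketch below is an illustration only (the helper name `hemisphere_tile_count` is an assumption, not from the article); it works on the unit sphere and uses exact fractions so that the count 2π/S comes out as an exact number.

```python
from fractions import Fraction

def hemisphere_tile_count(p, q, r):
    """Number of (pi/p, pi/q, pi/r) triangles needed for a half-sphere,
    or None when 1/p + 1/q + 1/r <= 1 and no such spherical triangle exists."""
    excess = Fraction(1, p) + Fraction(1, q) + Fraction(1, r) - 1  # equals S / pi
    if excess <= 0:
        return None
    return 2 / excess  # hemisphere area 2*pi divided by the triangle area S

for triple in [(2, 3, 3), (2, 3, 4), (2, 3, 5), (2, 3, 6)]:
    print(triple, hemisphere_tile_count(*triple))
```

For the '2, 3, 5' triangle the excess is 1/30, which gives the 60 tiles quoted in the text.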

Given a tessellation of the sphere, it is conceptually
very easy to split a spherical dome of given
thickness. We draw aligned spherical triangles
(polygons in general) on the inner and outer
sphere, and connect them by straight line segments
(see Figure 2).

Figure 2 Conversion from spherical triangles into polyhedral
dome elements (left) and the net of the dome elements (right)

The Construction Procedure 

Combining the above gives us a method for constructing
an igloo. To start the dome, we begin
with two concentric circles (see Figure 3a). To
initialise construction, we first place 12 segments
in a non-trivial order (Figure 3b). Note that three
elements of the same orientation are placed next
to each other, on three different triangle sides. The
first row has point reflection symmetry with respect
to the centre of the circles. Further blocks
are simply reflections of those already placed
(Figures 3c and 3d). The most difficult operation
is the placement of the final four elements
(Figures 3e and 3f), which should ideally all be
placed at once.

Paper, gypsum, wet snow and ice bricks have been 
used to test this procedure on small scales. The 
igloo has some tendency to come apart under its 
own weight, so a band around the base must be 
used. 

Conclusion

The '2, 3, 5' spherical triangle above provides a 
working solution to the igloo building problem, 
requiring only two different brick forms (the 
two orientations). Another interesting solution 
is based on geodesic domes (two different equi- 
lateral triangles, 90 bricks). It is still not known 
whether any single small block type is sufficient
to tile the hemispherical dome. Possible search 
areas are exceptional spherical tilings, and nearly 
spherical polyhedrons similar to the deltoidal 
icositetrahedron. 
Figure 3 The construction of the igloo, left-handed and right-handed blocks coloured red and blue respectively
500 Years of Mathematical Anniversaries

1713: 300 years have now passed since the publication of
Jacob Bernoulli's book Ars conjectandi (The Art of
Conjecture). It proved to be an extremely important
work on probability. It contains the first discussion
of the Bernoulli numbers.

1913: 100 years ago this year, Hardy received the now
infamous letter from Ramanujan, and the prolific
mathematician Paul Erdős was born. In addition,
Bohr's first quantum model of the atom was written down.

1738: The 275th anniversary of the publishing of Daniel
Bernoulli's Hydrodynamica has now passed. It gave
for the first time the correct analysis of water
flowing from a hole in a container, whilst also
providing the basis of the kinetic theory of gases.

1963: 2013 sees the 50th anniversary of Paul Cohen
demonstrating that neither the continuum hypothesis
nor the axiom of choice can be proven from the
standard axioms of set theory. Furthermore, Edward
Norton Lorenz published solutions for a simplified
mathematical model of atmospheric turbulence -
generally known as the Lorenz Attractor or the
Butterfly Effect.

2013: This year we witnessed the Nobel Prize for Physics
being awarded to Higgs for his work on the Higgs
Boson. Additionally, the whole of 2013 was designated
the International Year of Statistics.

1938: 75 years ago, Kolmogorov published Analytic Methods
in Probability Theory, which laid the foundations of
the theory of Markov random processes.

2003: 10 years ago, Grigori Perelman proved the Poincaré
conjecture. Perelman was later offered a Fields Medal
for his proof of this Millennium Maths Problem.
However, he chose to decline the award.
One of the most popular BBC TV programs
in the UK, which also spawned a
popular board game, is called Pointless.
In this game, a team is given a category which 
consists of several items (e.g. "What are the Cana- 
dian Provinces?"), and is asked to name one item. 
If it is correct, they are awarded points, the num- 
ber of which decreases with the "obscurity" of the 
item; if it is incorrect they score a large number of 
points. Each team's aim is to minimise their total
score. The level of obscurity is determined by a 
questionnaire administered to audience members 
ahead of time, asking to list all items belonging 
to this category that they know. Items which are 
listed by x% of the audience are worth v points 
(or a function thereof) for the team selecting this 
(single) item later. 

The game is therefore based on the trade-off be- 
tween selecting a high probability item, with a 
(likely) high score, and selecting a low probabil- 
ity item, with a (likely) low score. Of course, the 
teams do not know exactly how well known vari- 
ous items are. The game is related to other proba- 
bility-selecting games of the type discussed in [1]. 

A Risk-Neutral Team 

We shall formulate the problem as maximization 
(rather than the minimisation format of Point- 
less). Let c be the penalty for a wrong answer. Let 
b be the highest number of points possible (cor- 


responding to an item which no-one in the au- 
dience listed). Let x be the subjective probability 
that an item is correct. It becomes the expected 
number of audience members listing the item. 

The expected number of points earned by select- 
ing an item with subjective probability x of being 
correct is 

$$P(x) \equiv E_Y\big(x(b - x^a Y)\big) + (1 - x)(-c).$$

The term $x^a$, $a > 0$, embodies the contest's rule
about how more and more popular choices will be
penalized - for instance, Pointless has $a = 1$. $Y$ is a
multiplicative noise term with $E(Y) = \mu$, reflecting
the variation in the audience popularity of an
item with its subjective probability. Then

$$P(x) = -c + (b + c)x - \mu x^{a+1}.$$

Note that only the mean of $Y$ matters here. Elementary
calculus shows that

$$x^* = \left[\frac{b + c}{\mu(a + 1)}\right]^{1/a},$$

where $x^*$ is the optimal subjective probability item
to choose in this model.
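The closed form for the optimum can be compared with a brute-force search. This is a minimal sketch under stated assumptions: the parameter values `a, b, c, mu` are hypothetical and chosen only so that the optimum lies inside (0, 1), and the expected score is taken as $P(x) = -c + (b + c)x - \mu x^{a+1}$, which is what expanding $x(b - \mu x^a) + (1 - x)(-c)$ gives.

```python
def expected_points(x, a, b, c, mu):
    """P(x) = -c + (b + c) x - mu x^(a+1)."""
    return -c + (b + c) * x - mu * x ** (a + 1)

def optimal_x(a, b, c, mu):
    """Stationary point of P: x* = [(b + c) / (mu (a + 1))]^(1/a)."""
    return ((b + c) / (mu * (a + 1))) ** (1 / a)

# Hypothetical parameters, chosen only so that x* lies in (0, 1).
a, b, c, mu = 1, 1.0, 0.5, 1.0
x_star = optimal_x(a, b, c, mu)

# Brute-force check on a grid of subjective probabilities in [0, 1].
grid = [i / 10000 for i in range(10001)]
x_grid = max(grid, key=lambda x: expected_points(x, a, b, c, mu))

print(x_star, x_grid, expected_points(x_star, a, b, c, mu))
```

Since P is concave for a > 0 and μ > 0, the stationary point is the maximum, and the grid search should land on the same value up to the grid spacing.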

In order to have $P(x^*) > 0$, so that playing optimally
is beneficial (playing not optimally might
not be), we need

$$\left[\frac{b + c}{\mu(a + 1)}\right]^{1/a} > \frac{(a + 1)c}{a(b + c)}$$
(clearly holds for small $c$). If $c = b$ and $a = 1$ (as in Pointless), we need $b > \mu$.

We see that $x^*$ increases in $c$ (a higher error penalty
induces more caution, which makes sense). It
is increasing in $b$ and decreasing in $\mu$.

Sequential Guessing 

If, on the other hand, the teams guess sequentially,
and the first team attained $v$ points, the second
(us) wishes to maximise the probability of attaining
more than $v$ points.

This is equal to

$$P_Y\left[-c + (b + c)x - Y x^{a+1} > v\right] = P_Y\left[Y < \frac{(b + c)x - c - v}{x^{a+1}}\right].$$

Elementary calculus gives

$$x^{**} = \frac{(a + 1)(c + v)}{a(b + c)},$$

where $x^{**}$ is the optimal subjective item probability
to choose in this model.

Thus $x^{**} > x^*$ iff

$$v > \frac{a(b + c)}{a + 1}\left[\frac{b + c}{\mu(a + 1)}\right]^{1/a} - c.$$
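The sequential optimum can be checked the same way. In this sketch (hypothetical parameter values; the names `threshold_ratio` and `sequential_x` are illustrative), the probability of beating v is treated as the probability that the noise term falls below $g(x) = ((b + c)x - c - v)/x^{a+1}$, so maximising the probability amounts to maximising g, assuming the distribution of the noise does not depend on x.

```python
def threshold_ratio(x, a, b, c, v):
    """g(x) = ((b + c) x - c - v) / x^(a+1); the winning event is Y < g(x)."""
    return ((b + c) * x - c - v) / x ** (a + 1)

def sequential_x(a, b, c, v):
    """x** = (a + 1)(c + v) / (a (b + c))."""
    return (a + 1) * (c + v) / (a * (b + c))

# Hypothetical parameters: the first team has scored v = 0.1.
a, b, c, v = 1, 1.0, 0.5, 0.1
x_seq = sequential_x(a, b, c, v)

# Brute-force maximisation of g over a grid in (0, 1].
grid = [i / 10000 for i in range(1, 10001)]
x_grid = max(grid, key=lambda x: threshold_ratio(x, a, b, c, v))

print(x_seq, x_grid)
```

With these values and a noise mean of 1, the risk-neutral optimum would be 0.75 with expected score 0.0625, so a first-team score of 0.1 pushes the second team to the larger value 0.8.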


An Alternative Approach 

Suppose now that the team selects an item whose 
audience popularity they estimate as v. 

If the item is correct, they will obtain a prize with
a utility of $u(1 - v)$; otherwise they will obtain $-c$.
For such an item, let the probability of being correct
be denoted by $p(v)$. We assume that $p', u' > 0$
and $p'', u'' < 0$. That implies that the function
$u(1 - v)$ is concave in $v$. So our problem is to find

$$\max_v \left[p(v)\,u(1 - v) + (1 - p(v))(-c)\right].$$

A comparison with the previous model reveals
that the current model is more general, allowing
$u(1 - v)$ to be an arbitrary function, and the
probability $p(v)$ to be general.

References and Notes 
What does it mean to be computable? A
function is computable if for a given
input its output can be calculated by a
finite mechanical procedure. But can we pin this
idea down with rigorous mathematics?

In 1928, David Hilbert (see [4]) proposed his famous
Entscheidungsproblem, which asks if there
is a general procedure for showing that a statement
is provable from a given set of axioms. To
solve this problem mathematicians first needed to
define what it meant to be computable. The first
attempt was through primitive recursive functions
and was a combined effort by many researchers,
including Kurt Gödel, Alonzo Church, Stephen
Kleene, Wilhelm Ackermann, John Rosser, and
Rózsa Péter.

Recursive Functions

Primitive recursive functions are defined as a
recursive type, starting with a few functions that
we assume are computable, called founders, and
operators that construct new functions from the
founders, called constructors. The founders are
the following three functions:

The constant zero function: a function that always
returns zero.

The successor function $S(n) = n + 1$.

The projection function $\mathrm{proj}^m_n$ is an $m$-ary
function that returns the $n$-th argument.

Computability theory wasn't going to get very far
if these functions weren't computable. Next, we
have two operations for constructing new functions
from old: composition and primitive recursion.

Composition Given a primitive recursive
$m$-ary function $h$ and $m$ $n$-ary functions
$g_1, \ldots, g_m$, the function $f(x) = h(g_1(x), \ldots, g_m(x))$ is
primitive recursive.

Primitive Recursion Given primitive recursive
functions $g$ and $h$, the function
$f(x, 0) = g(x)$, $f(x, y + 1) = h(x, y, f(x, y))$ is primitive
recursive.

The set of primitive recursive functions is the 
set of functions constructed from our three ini- 
tial functions and closed under composition and 
primitive recursion. Many familiar functions are 
primitive recursive: addition, multiplication,
exponentiation, primes, max, min, and the
logarithm function all fit the bill.
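To make the construction concrete, here is a minimal sketch of the three founders and the two constructors as Python higher-order functions, with addition and multiplication built from them. The names (`zero`, `succ`, `proj`, `compose`, `primitive_recursion`) are illustrative assumptions, and the projection here is indexed only by position, since Python functions can accept any number of arguments.

```python
def zero(*args):
    """Constant zero function."""
    return 0

def succ(n):
    """Successor function S(n) = n + 1."""
    return n + 1

def proj(n):
    """Projection returning the n-th argument (1-indexed)."""
    return lambda *args: args[n - 1]

def compose(h, *gs):
    """Composition: f(x) = h(g_1(x), ..., g_m(x))."""
    return lambda *xs: h(*(g(*xs) for g in gs))

def primitive_recursion(g, h):
    """f(x, 0) = g(x); f(x, y + 1) = h(x, y, f(x, y))."""
    def f(*args):
        *xs, y = args
        acc = g(*xs)
        for i in range(y):  # a bounded loop: exactly y iterations
            acc = h(*xs, i, acc)
        return acc
    return f

# add(x, 0) = x; add(x, y + 1) = S(add(x, y))
add = primitive_recursion(proj(1), compose(succ, proj(3)))
# mul(x, 0) = 0; mul(x, y + 1) = add(x, mul(x, y))
mul = primitive_recursion(zero, compose(add, proj(1), proj(3)))

print(add(3, 4), mul(3, 4))
```

The recursion is unrolled into a `for` loop whose length is fixed before it starts, which mirrors why every primitive recursive function is total.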

So are we done? Is every computable function 
also primitive recursive? Sadly, no: the Acker- 
mann function (A(m, n) below) would be proven 
in 1928 to be a counterexample. 

$$A(m, n) = \begin{cases} n + 1 & \text{if } m = 0 \\ A(m - 1, 1) & \text{if } m > 0 \text{ and } n = 0 \\ A(m - 1, A(m, n - 1)) & \text{if } m > 0 \text{ and } n > 0 \end{cases}$$

The Ackermann function is a total (defined for all
inputs) function that is clearly computable but not
primitive recursive. Indeed, in 1928 Ackermann
(see [1]) showed that his function bounds every
primitive recursive function - it grows too fast to
be primitive recursive.
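The three-case definition translates directly into code. The sketch below is one possible implementation, not taken from the article: it replaces the nested recursion by an explicit stack of pending first arguments so that Python's recursion limit is not an issue for small inputs.

```python
def ackermann(m, n):
    """Two-argument Ackermann function A(m, n) from the case definition,
    evaluated with an explicit stack instead of nested recursive calls."""
    stack = [m]  # first arguments of the calls still waiting for a value
    while stack:
        m = stack.pop()
        if m == 0:
            n = n + 1                 # A(0, n) = n + 1
        elif n == 0:
            stack.append(m - 1)       # A(m, 0) = A(m - 1, 1)
            n = 1
        else:
            stack.append(m - 1)       # outer call A(m - 1, ...)
            stack.append(m)           # inner call A(m, n - 1), done first
            n = n - 1
    return n

print(ackermann(2, 3), ackermann(3, 3))
```

Even modest inputs such as m = 4 are already far out of reach, which is the rapid growth the text refers to.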

Something was clearly wrong, but early computability
theorists didn't want to abandon primitive
recursive functions entirely. What came next was
a rather surprising idea at the time: perhaps computable
functions need not be total! This was the
key that unlocked computability theory: focusing
on partial functions, those that may not be defined
on all possible inputs.

The reason for focusing on partial functions is to
allow an unbounded search operator. That is, we
want to be able to search for the least input value
that satisfies a condition and simply be undefined
if no such input value exists. This operation is captured
by Kleene's μ-operator.

μ-operator $f(x) = (\mu y)(g(x, y) = 0)$ returns the
least $y$ such that $g(x, y) = 0$ and is undefined if no
such $y$ exists. The function $g(x, y')$ must be defined
for all $y' < y$.
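Unbounded search is easy to sketch in code. In the illustration below, the helper name `mu` and the integer square root example are assumptions made for the demonstration; when no suitable y exists the loop simply never returns, which is exactly the sense in which the resulting function is partial.

```python
from itertools import count

def mu(g):
    """Return f with f(x) = least y such that g(x, y) == 0.
    If no such y exists the search never terminates: f is a partial function."""
    def f(*xs):
        for y in count():        # y = 0, 1, 2, ... with no upper bound
            if g(*xs, y) == 0:
                return y
    return f

# Illustrative example: integer square root as the least y with (y + 1)^2 > x.
isqrt = mu(lambda x, y: 0 if (y + 1) ** 2 > x else 1)

print(isqrt(10), isqrt(16))
```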

Taking the closure of the μ-operator with all
primitive recursive functions gives a class of
μ-recursive functions. In 1943, Kleene (see [5])
used his μ-operator to provide an alternative, but
equivalent, definition of general recursive functions.
The original definition was given by Gödel
in 1934 (see [3]), based on an observation by
Jacques Herbrand. It would later be shown that
μ-recursive functions are the exact same class of
functions defined by two competing approaches
(see [6]).

λ-Calculus

Simultaneously, from 1931-1934, Church and
Kleene were developing λ-calculus as an approach
to computable functions. The syntax of
λ-calculus defines certain expressions as valid
statements, which are called λ-terms. A λ-term
is built up from a collection of variables and two
operators: abstraction and application.

Let's start with a collection of variables x, y, z, ...
and suppose M, N are valid λ-terms. The abstraction
operator creates the term λx.M, which is a
function taking an argument x and returning M
with each occurrence of x replaced with the argument.
The application operator creates the term
M N, which represents the application of a function
M on input N.

The λ-term λx.M represents a function f(x) = M
and - like recursive functions - many familiar
functions are λ-definable. The α-conversion and
β-reduction are classic examples of reductions,
which describe how λ-terms are evaluated. An
α-conversion captures the notion that the name of
an argument is usually immaterial. For instance
λx.x and λy.y both represent the identity function
and are α-equivalent. A β-reduction applies
a function to its arguments. Take, as an example,
the λ-term (λx.x)y, which represents the identity
function (λx.x) applied to the input y. Substituting
the argument y for the parameter x, the result of
the function is y. So we say (λx.x)y β-reduces to y.
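The two reductions can be illustrated with a very small term representation. This is a sketch under stated assumptions, not a full evaluator: the class names `Var`, `Lam`, `App` and the helper functions are invented for the example, and the substitution does no renaming, so it is only safe when the free variables of the argument are not bound inside the function body.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Var:
    name: str

@dataclass(frozen=True)
class Lam:          # abstraction: lambda param. body
    param: str
    body: object

@dataclass(frozen=True)
class App:          # application: func arg
    func: object
    arg: object

def substitute(term, name, value):
    """Replace free occurrences of `name` in `term` by `value`.
    Simplification: no renaming is performed (see the caveat above)."""
    if isinstance(term, Var):
        return value if term.name == name else term
    if isinstance(term, Lam):
        if term.param == name:      # `name` is shadowed below this binder
            return term
        return Lam(term.param, substitute(term.body, name, value))
    return App(substitute(term.func, name, value),
               substitute(term.arg, name, value))

def beta_reduce(term):
    """One beta-step at the root: (lambda x. M) N becomes M with x := N."""
    if isinstance(term, App) and isinstance(term.func, Lam):
        return substitute(term.func.body, term.func.param, term.arg)
    return term

def alpha_equivalent(s, t, env=()):
    """Equality up to renaming of bound variables.
    `env` pairs up the binders passed so far, innermost first."""
    if isinstance(s, Var) and isinstance(t, Var):
        for left, right in env:
            if s.name == left or t.name == right:
                return s.name == left and t.name == right
        return s.name == t.name     # both variables are free
    if isinstance(s, Lam) and isinstance(t, Lam):
        return alpha_equivalent(s.body, t.body, ((s.param, t.param),) + env)
    if isinstance(s, App) and isinstance(t, App):
        return (alpha_equivalent(s.func, t.func, env)
                and alpha_equivalent(s.arg, t.arg, env))
    return False

identity_x = Lam("x", Var("x"))
identity_y = Lam("y", Var("y"))
print(alpha_equivalent(identity_x, identity_y))
print(beta_reduce(App(identity_x, Var("y"))))
```

The last two lines reproduce the examples from the text: λx.x and λy.y are α-equivalent, and (λx.x)y β-reduces to y.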

In 1934 Church proposed that the term "effectively
calculable" be identified with λ-definable. While
Church's formalization of computability would
later be shown to be equivalent to Turing's, Gödel
was dissatisfied with Church's work. In fairness,
Gödel was also dissatisfied with his own work!
Church would go on to advocate that "effectively
calculable" should be identified with general recursive
functions (which Gödel still rejected). In
1936 Church (see [2]) published his work proving
that the Entscheidungsproblem was undecidable:
there is no general procedure for determining if a
statement is provable from a given set of axioms.

Turing Machines 

Meanwhile, after hearing about Hilbert's Entscheidungsproblem,
a 22-year-old Cambridge student
named Alan Turing began working on his
own solution to the problem. Turing was unaware
of Church's work at the time, so his approach
wasn't influenced by λ-expressions (this wasn't the
first time Turing failed to perform a literature review).
Instead, he envisioned an idealized human
agent performing a computation, which he called
a "computer". To avoid confusion with the modern
definition of computer, we'll adopt the terminology
of Robin Gandy and Wilfried Sieg and
use the term "computor" to refer to an idealized
human agent. The computor had infinite available
memory called a tape, essentially an infinite strip
of paper, that was divided into squares. The computor
could read and write to a square, as well as
move from one square to another.

Turing put several conditions on the computation
that the computor could perform. The computor
could only have finitely many states (of mind) and
the tape could only hold symbols from a finite alphabet.
Only a finite number of squares could be
observed at a time and the computor could only
move to a new square that was at most some finite
distance away from an observed square. He also
required that any operation must depend only on
the current state and the observed symbols, and
that there was at most one operation that could
be performed per action (his machines were
deterministic).
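These conditions are what a modern single-tape machine simulator encodes. The following is a minimal sketch, with all names (`run_turing_machine`, `successor`, the state labels and the blank symbol `_`) chosen for illustration; it observes one square at a time, moves at most one square per step, and looks up the unique action from the current state and observed symbol.

```python
def run_turing_machine(transitions, tape, state="start", halt="halt",
                       blank="_", max_steps=10_000):
    """Run a deterministic single-tape machine and return the final tape.
    transitions maps (state, symbol) to (new_state, symbol_to_write, move),
    where move is "L", "R" or "N". max_steps is a safety budget for the demo."""
    cells = dict(enumerate(tape))   # sparse tape: position -> symbol
    head = 0
    steps = 0
    while state != halt:
        if steps >= max_steps:
            raise RuntimeError("no halt within the step budget")
        symbol = cells.get(head, blank)
        state, write, move = transitions[(state, symbol)]
        cells[head] = write
        head += {"L": -1, "R": 1, "N": 0}[move]
        steps += 1
    if not cells:
        return ""
    return "".join(cells.get(i, blank)
                   for i in range(min(cells), max(cells) + 1)).strip(blank)

# Unary successor: walk right over the 1s and write one more 1 at the end.
successor = {
    ("start", "1"): ("start", "1", "R"),
    ("start", "_"): ("halt", "1", "N"),
}

print(run_turing_machine(successor, "111"))
```

The step budget is a convenience for experimentation only; a genuine Turing machine has no such limit and may run forever.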

From this, Turing would go on to define his automatic
machines - which would later come to be
known as Turing machines - and show the equivalence
of the two formalisations. He'd then show
that "effectively calculable" implied computable
by his idealized human agent, which in turn implied
computable by such a machine. Turing then
went on to show that the Entscheidungsproblem
was undecidable. Shortly before publishing his
work, he learned that Church had already shown
that the Entscheidungsproblem was undecidable
using λ-calculus. Turing quickly submitted his
work in 1936 (see [7]) - six months after Church -
along with a proof demonstrating the equivalence
between his machines and λ-calculus.

After reading Turing's seminal paper, Gödel was
finally convinced that the correct notion of computability
had been determined. It would later be
shown that all three formalisations - Turing machines,
μ-recursion, and λ-calculus - actually define
the same class of functions. That these three
approaches all yielded the same class of functions
suggested that mathematicians had captured the
correct notion of computation, and supported
what would come to be known as the Church-Turing
Thesis.

Three years later, in 1939, Turing completed
his PhD at Princeton under the supervision of
Church. In his thesis he'd state the following (see
[8]): "We shall use the expression 'computable
function' to mean a function calculable by a machine,
and let 'effectively calculable' refer to the
intuitive idea without particular identification
with any one of these definitions."

Church-Turing Thesis Every effectively calcula- 
ble function is a computable function. 

Church intended for his original thesis to be taken 
as a definition of what is computable. Likewise, 


even though he never stated it, Turing had the 
same intention. In fact, the term "Church's Thesis"
was coined by Kleene many years after Church 
had published his work. These days, many peo- 
ple take the Church-Turing Thesis as a definition 
of what is computable; less formally stating that 
a function is computable if and only if it can be 
computed by a Turing machine. 

It's important to stress that the Church-Turing
Thesis is not a definition as many believe. It does
not refer to any particular formalization that
we've discussed and is not a statement that can be
formally proven. It is a statement about the nature 
of computation. Everything that is "effectively 
calculable", in the vague and intuitive sense, is a 
computable function. 
Identity (1) for n = 5:

1·252 + 2·70 + 6·20 + 20·6 + 70·2 + 252·1 = 4^5

Lemma for N = 5, α = 9 and m = 11:

1·364 - 4·66 + 3·10 + 0·1 + 0·0 + 0·0 = 0 + 0 + 1 + 9 + 36 + 84

Corollary for n = 5, α₁ = 1 and α₂ = 2:

1·792 + 3·210 + 10·56 + 35·15 + 126·4 + 462·1 = 1 + 14 + 91 + 364 + 1001 + 2002

Identity (3) for n = 5 and α = -3:

1·1287 - 1·330 + 0·84 + 1·21 + 5·5 + 21·1 = 4^5

Figure 4 The extended Pascal's triangle at integer gridpoints with some identities illustrated



Figure 5The underlying surface defined by {(/3 ,y, $)\ /? jS y<^-<^+1 =0}.The series of$p Q (y) converges on the light part and 
it is given by the intersection of the /3 = /3 0 plane with the surface. 
A Binomial Identity

David Szabo

MSc Mathematics Student, Eötvös Loránd University


In 2009 I was given the following identity (cf.
Fig. 4) as an exercise in a class.

$$\sum_{k=0}^{n} \binom{2k}{k}\binom{2n-2k}{n-k} = 4^n. \qquad (1)$$

Throughout this paper $n \ge 0$ is an integer. This
paper summarises my investigation of this identity
at that time. In four independent sections we
will see several solutions and generalisations. The
developed ideas will lead to a generalisation of
the binomial theorem. I advise the reader to try to
prove (1) before reading on.
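Before attempting a proof, the identity $\sum_{k=0}^{n} \binom{2k}{k}\binom{2n-2k}{n-k} = 4^n$ can be checked for small n. The snippet below is only a numerical sanity check (the function name `central_convolution` is an illustrative choice), using the standard library's `math.comb`.

```python
from math import comb

def central_convolution(n):
    """Left-hand side of identity (1): sum of C(2k, k) * C(2n - 2k, n - k)."""
    return sum(comb(2 * k, k) * comb(2 * n - 2 * k, n - k)
               for k in range(n + 1))

print([central_convolution(n) for n in range(6)])
print(all(central_convolution(n) == 4 ** n for n in range(50)))
```

For n = 5 the six terms are 252, 140, 120, 120, 140 and 252, matching the row illustrated in Fig. 4.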

Extending Pascal's Triangle

Considering the Taylor expansion of $(1 + x)^\alpha$
about $x = 0$, it is sensible to generalise the binomial
coefficients as $\binom{\alpha}{k} := \frac{\alpha(\alpha - 1)\cdots(\alpha - k + 1)}{k!}$ for $\alpha \in \mathbb{R}$,
$k \in \mathbb{Z}_{\ge 0}$, which is a polynomial in $\alpha$ (the empty
product and 0! is 1). With this new notation the
above Taylor series gives the binomial theorem
$(1 + x)^\alpha = \sum_{k=0}^{\infty} \binom{\alpha}{k} x^k$ for $|x| < 1$. These coefficients
indeed extend the combinatorial definition
for $0 \le k \le \alpha$ all integers, and they still satisfy
$\binom{\alpha+1}{k+1} = \binom{\alpha}{k} + \binom{\alpha}{k+1}$ (Pascal's recursion). We extend
the definition further for $k \in \mathbb{Z}_{<0}$ using this
recursion formula (cf. Fig. 4). After short calculations
we see that:

$\binom{\alpha}{k} = 0$ for $\alpha \in \mathbb{Z}$, $0 \le \alpha < k$,

$\binom{\alpha}{k} = 0$ for $k < 0$,

$\binom{-1}{k} = (-1)^k$ for $k \ge 0$,

$\binom{\alpha}{k} = (-1)^k \binom{k - \alpha - 1}{k}$, the upper negation.
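The extended coefficients are easy to experiment with using exact rational arithmetic. This is a sketch under the assumption that the coefficient is $\alpha(\alpha - 1)\cdots(\alpha - k + 1)/k!$ for $k \ge 0$ and 0 for $k < 0$; the function name `gbinom` is illustrative.

```python
from fractions import Fraction

def gbinom(alpha, k):
    """Generalised binomial coefficient: alpha (alpha-1) ... (alpha-k+1) / k!
    for k >= 0, and 0 for k < 0 (the extension via Pascal's recursion)."""
    if k < 0:
        return Fraction(0)
    result = Fraction(1)            # empty product for k = 0
    for i in range(k):
        result = result * (Fraction(alpha) - i) / (i + 1)
    return result

half = Fraction(-1, 2)

# Pascal's recursion, checked for a few real (here rational) upper arguments.
pascal_ok = all(gbinom(a + 1, k + 1) == gbinom(a, k) + gbinom(a, k + 1)
                for a in (half, Fraction(7, 3), -3, 5)
                for k in range(-3, 8))

# Upper negation for k >= 0.
negation_ok = all(gbinom(a, k) == (-1) ** k * gbinom(k - a - 1, k)
                  for a in (half, Fraction(7, 3), -3, 5)
                  for k in range(8))

print(pascal_ok, negation_ok, gbinom(-1, 3), gbinom(half, 2))
```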

Combinatorial Solutions

Consider the directed graph $P$ (cf. Fig. 1) with
vertices $[{n \atop k}]$ and edges $[{n \atop k}] \to [{n+1 \atop k}]$ and $[{n \atop k}] \to [{n+1 \atop k+1}]$ for
all $n, k \in \mathbb{Z}$, $0 \le k \le n$. Say vertex $[{n \atop k}]$ is in row $n$ and
column $k$; call $[{2n \atop n}]$ a central vertex for any $n$.

For a directed path $\Gamma$ in $P$ with start vertex from
row $s$ and end vertex from row $e$ we denote its part
restricted to rows between and including $s'$ and $e'$
by $\Gamma|_{s'}^{e'}$ for $s \le s' \le e' \le e$.

We denote by $\{S \xrightarrow{\mathcal{O}} E\}$ the set of directed paths
with start vertex from the set $S$ and end vertex
from $E$ satisfying some optional condition
$\mathcal{O}$. Here $S$ and $E$ will be of the form $\{[{n \atop k_1}] : k_1 \mathrel{R} k\}$
for some binary relation $R$ (such as $<$ or $\ge$), and
we abbreviate this set as $[{n \atop R\,k}]$ and omit writing $R$
when it is '='. Finally let $[{n \atop *}] = [{n \atop \ge 0}]$, the whole of
row $n$.

Figure 1 The infinite directed graph P


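Solution I Note that $\#\{[{0 \atop 0}] \to [{n \atop k}]\} = \binom{n}{k}$. It is natural
to think of $4^n$ as $2^{2n} = \sum_{k=0}^{2n} \binom{2n}{k}$, i.e. $4^n =
\#\{[{0 \atop 0}] \to [{2n \atop *}]\}$. Consider an arbitrary $\Gamma \in \{[{0 \atop 0}] \to [{2n \atop *}]\}$.

The term $\binom{2k}{k}$ suggests that in (1) the set $\{[{0 \atop 0}] \to [{2n \atop *}]\}$
is counted in two different ways, with one counting
conditioning on a (unique) $k$ such that $[{2k \atop k}]$ is
in $\Gamma$; it is natural to choose the biggest such $k$. Formally,
let condition $\mathcal{C}$ mean that the path contains
a non-start central vertex. Then $\Gamma$ has a unique
vertex $[{2k \atop k}]$ for some $0 \le k \le n$ such that $\Gamma|_{2k}^{2n}$ does
not satisfy $\mathcal{C}$ (written as $\neg\mathcal{C}$), [i.e. $\Gamma$ meets the central
vertices at $[{2k \atop k}]$ for the last time]. This is well-defined
as vertex $[{0 \atop 0}]$ is in $\Gamma$. This gives

$$4^n = \#\left\{[{0 \atop 0}] \to [{2n \atop *}]\right\} = \sum_{k=0}^{n} \#\left\{[{0 \atop 0}] \to [{2k \atop k}]\right\} \cdot \#\left\{[{2k \atop k}] \xrightarrow{\neg\mathcal{C}} [{2n \atop *}]\right\},$$

which is the combinatorial meaning of (1).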
To show this we claim that

$$\#\left\{[{2k \atop k}] \xrightarrow{\neg\mathcal{C}} [{2n \atop *}]\right\} = \binom{2n - 2k}{n - k}. \qquad (2)$$

To prove this claim, assign the number
$f(i, j) := \pm\#\{[{2k \atop k}] \xrightarrow{\neg\mathcal{C}} [{2k+i \atop k+j}]\}$ to vertex $[{2k+i \atop k+j}]$ for
$0 \le j \le i \le 2n - 2k$ [i.e. for the vertices reachable
from $[{2k \atop k}]$ up to row $2n$], where the sign is + for
$j \le i/2$ [i.e. for vertices on the left of the central
vertices], and it is - for $j > i/2$, cf. Fig. 2.

We see the boundary conditions $f(i, 0) = 1$ for any
$i$, and $f(i, i) = -1$ for $i \ne 0$. Also $f(i+1, j+1) =
f(i, j) + f(i, j+1)$ for any $i$ and $j$ because this recursion
is inherited from the paths for $j+1 \ne (i+1)/2$,
and for $j+1 = (i+1)/2$ it follows from symmetry
and from the choice of the sign in $f$. Notice that
this recursion with the boundary condition determines
$f$ uniquely. The binomial coefficients (and
arbitrary linear combination of their shifted copies)
satisfy this recursion, and it is easy to combine
them to match the boundary conditions; the
solution is $f(i, j) = \binom{i-1}{j} - \binom{i-1}{j-1}$, cf. Fig. 2. Hence
the LHS of (2) is

$$\#\left\{[{2k \atop k}] \xrightarrow{\neg\mathcal{C}} [{2n \atop *}]\right\} = \sum_{j=0}^{2n-2k} |f(2n - 2k, j)|,$$

which is a telescopic sum where all terms but two
copies of $\binom{2n-2k-1}{n-k-1}$ cancel, so it is indeed $\binom{2n-2k}{n-k}$
as claimed. □
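The claim that the number of paths of length 2n - 2k leaving a central vertex and never meeting another central vertex equals the central binomial coefficient can be confirmed by direct counting. The sketch below is an independent check, not the article's proof: it tracks the offset (right steps minus left steps), which is zero exactly at central vertices. The function name is an illustrative choice.

```python
from math import comb

def paths_avoiding_centre(steps):
    """Count left/right paths with the given number of steps that start at a
    central vertex and never meet a central vertex again."""
    counts = {0: 1}                      # offset -> number of paths so far
    for _ in range(steps):
        nxt = {}
        for offset, ways in counts.items():
            for delta in (-1, 1):        # step to column k or column k + 1
                new = offset + delta
                if new != 0:             # offset 0 would be a central vertex
                    nxt[new] = nxt.get(new, 0) + ways
        counts = nxt
    return sum(counts.values())

print([paths_avoiding_centre(2 * m) for m in range(6)])
print([comb(2 * m, m) for m in range(6)])
```

Both lists should agree, one entry for each value of n - k.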




Figure 2 Smaller numbers on the sides are constant multiples
of Pascal's triangle so that their sum is f(i, j), the value
in the centre.


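The idea of this solution can be interpreted more
elegantly purely combinatorially.

Solution II We will finish Solution I by showing
$\#\{[{2k \atop k}] \xrightarrow{\neg\mathcal{C}} [{2n \atop *}]\} = \#\{[{2k \atop k}] \to [{2n \atop n}]\}$, from which (2)
follows. We will do so by exhibiting a bijection

$$\varphi : \left\{[{2k \atop k}] \xrightarrow{\mathcal{C}} [{2n \atop *}]\right\} \to \left\{[{2k \atop k}] \to [{2n \atop \ne n}]\right\}$$

between the complements in $\{[{2k \atop k}] \to [{2n \atop *}]\}$. Note
that $\mathcal{C}$ is absent in the image.

For $k = n$ both the domain and the image are $\emptyset$.
Else any path from the domain has a second vertex
which is either $[{2k+1 \atop k}]$ (call this condition $\mathcal{L}$)
or $[{2k+1 \atop k+1}]$ (condition $\mathcal{R}$). We also distinguish the
cases in which the end vertex is in $[{2n \atop \ge n}]$ or in $[{2n \atop < n}]$.
We will define $\varphi$ separately on the resulting four
partitions of the domain.

Figure 3 φ illustrated on the first two partitions

Pick $\Gamma_1 \in \{[{2k \atop k}] \xrightarrow{(\mathcal{C}) \wedge \mathcal{L}} [{2n \atop \ge n}]\}$ from the first partition.
Now $\mathcal{C}$ is automatically satisfied as the second
and end vertex are on different sides of the
central vertices (in all such cases we put $\mathcal{C}$ into
brackets). Define $\varphi(\Gamma_1) \in \{[{2k \atop k}] \xrightarrow{\mathcal{R}} [{2n \atop > n}]\}$ by letting
$\varphi(\Gamma_1)|_{2k+1}^{2n}$ be the image of $\Gamma_1|_{2k+1}^{2n}$ under $[{i \atop j}] \mapsto [{i \atop j+1}]$
[i.e. we shift $\Gamma_1$ to the right by 1 but keep its start
vertex fixed to get $\varphi(\Gamma_1)$, cf. Fig. 3]. $\varphi$ on this partition
is a bijection. Pick $\Gamma_2 \in \{[{2k \atop k}] \xrightarrow{\mathcal{C} \wedge \mathcal{L}} [{2n \atop < n}]\}$
from the second partition. $\mathcal{C}$ holds, so there is
a maximum $l \ne k$ such that $[{2l \atop l}]$ is in $\Gamma_2$. Define
$\varphi(\Gamma_2) \in \{[{2k \atop k}] \xrightarrow{(\mathcal{C}) \wedge \mathcal{R}} [{2n \atop < n}]\}$ by letting $\varphi(\Gamma_2)|_{2k}^{2l}$ be
the image of $\Gamma_2|_{2k}^{2l}$ under $[{i \atop j}] \mapsto [{i \atop i-j}]$ and $\varphi(\Gamma_2)|_{2l}^{2n}
= \Gamma_2|_{2l}^{2n}$ [i.e. we get $\varphi(\Gamma_2)$ from $\Gamma_2$ by reflecting its
initial segment ending at $[{2l \atop l}]$ to the line of the central
vertices, cf. Fig. 3]. Note that $\varphi(\Gamma_2)$ satisfies $\mathcal{R}$ (as
$l \ne k$) and $\mathcal{C}$ automatically, so $\varphi$ on this partition
is also a bijection from the unique choice of $l$.

So $\varphi$ on these two partitions is a bijection from
$\{[{2k \atop k}] \xrightarrow{\mathcal{C} \wedge \mathcal{L}} [{2n \atop *}]\}$ to $\{[{2k \atop k}] \xrightarrow{\mathcal{R}} [{2n \atop \ne n}]\}$. Analogously
defining $\varphi$ on the other two partitions gives the
same result but with $\mathcal{L}$ and $\mathcal{R}$ swapped. Thus $\varphi$ is
indeed as claimed. □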

Algebraic Solutions 

Solution III Note that in (2«)! we can separate 
even and odd numbers, hence observe that 
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Using this observation we see that (1) is a special 
case of the identity ELo (“*)) L-J = ( ai n“ 2 ) (for 
a x , aeR) when a x = a 2 = - 1/2. We prove this iden- 
tity in 3 steps. 


weighted sum of these resulting Pascals coeffi- 
cients as follows (cf. Fig. 4). 

Lemma For a e M, m e Z and Ne Z> 0 


For a lt a 2 eZ> n , it is just ELo#{[o] [T]}. 

#{[T]^rr]}=#{[^rr]}- 



-N- 1 + 2 k\ 
k ) 


a + N — 2k 
m — k 



Next fix $a_1 \in \mathbb{Z}_{\ge 0}$. Now we have a polynomial in
$a_2$ at both sides that coincide for $a_2 \in \mathbb{Z}_{\ge 0}$, so they
must be the same, thus the identity holds for $a_2 \in \mathbb{R}$.


Proof (Split-Induce-Merge) Instead of the described
method, use induction on $N$. The statement is true
for $N = 0, 1$. Let $N > 1$.


Finally fix $a_2 \in \mathbb{R}$. Now the polynomials in $a_1$
coincide at $a_1 \in \mathbb{Z}_{\ge 0}$, so the identity is true for
$a_1 \in \mathbb{R}$ as required to finish the solution. □


Split the RHS of the statement as 


ELN =E 


a + 1 
m — k 


N—2 

-E 


-1 — k' 


Noting that the identity in this solution simply
expresses the coefficient of $x^n$ in the binomial
expansion of $(1+x)^{a_1}(1+x)^{a_2} = (1+x)^{a_1+a_2}$, we can
give a short version of this solution restricting to
the $a_1 = a_2 = -1/2$ case only.


Solution IV Let $|4x| < 1$, consider the square of
the generating function $\sum_{n=0}^{\infty} \binom{2n}{n} x^n$, use the
$\binom{2n}{n} = (-4)^n \binom{-1/2}{n}$ observation and the binomial
theorem twice, with $\alpha = -1/2$ and then with $\alpha = -1$,
to prove (1) by comparing coefficients.


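\[
\sum_{n=0}^{\infty}\sum_{k=0}^{n}\binom{2k}{k}\binom{2n-2k}{n-k}x^n
= \Big[\sum_{n=0}^{\infty}\binom{2n}{n}x^n\Big]^2
= \Big[\sum_{n=0}^{\infty}\binom{-1/2}{n}(-4x)^n\Big]^2
= \big[(1-4x)^{-1/2}\big]^2
= \sum_{n=0}^{\infty} 4^n x^n \quad \square
\]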
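A quick numerical sanity check of Solution IV can be sketched in Python. It assumes that identity (1) reads $\sum_{k=0}^{n}\binom{2k}{k}\binom{2n-2k}{n-k} = 4^n$, as the generating-function argument suggests; the function names are illustrative only.

```python
from math import comb


def central_binomial_convolution(n: int) -> int:
    """Coefficient of x^n in (sum_j C(2j, j) x^j)^2, the assumed LHS of (1)."""
    return sum(comb(2 * k, k) * comb(2 * (n - k), n - k) for k in range(n + 1))


def check_identity(max_n: int) -> bool:
    """True if the convolution equals 4^n for every 0 <= n <= max_n."""
    return all(central_binomial_convolution(n) == 4 ** n
               for n in range(max_n + 1))
```

Calling `check_identity(40)` compares exact integers, so it tests the first 41 coefficients without rounding error; it is a check, not a proof.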


using Pascal's recursion.

Now use induction on the two new summations
and relabel the running variable $k' = k + 1$ to get


E 


m — k 


= E 


/-iV + 2/A /ck + 7V-2A:\ 

v )\ m-k / 


_ f-N - 1 + 2k\ /a + N — 2 k\ 

ti\ k ~l )\ ™~k / 

Notice that the binomial weight in the last summation
for $k = 0$ is 0, so we can add this term to
that summation.


Collect $\binom{\alpha+N-2k}{m-k}$ from the RHS and merge the
difference of the binomial coefficients into one using
Pascal's recursion. We are done after noting that
for $k = N$ the resulting binomial coefficient is 0
(as $N \neq 0$). □


Generalisation 

For $\alpha \in \mathbb{R}$, $k \in \mathbb{Z}$ call the real numbers $\binom{\alpha}{k}$ Pascal
coefficients if they satisfy
$\binom{\alpha+1}{k+1} = \binom{\alpha}{k} + \binom{\alpha}{k+1}$ (Pascal's
recursion), i.e. binomial coefficients without
specified boundary conditions.

In the combinatorial proofs, one key observation
was to consider the sum of row $2n$ in the classical
Pascal's triangle; now we will consider the arbitrary
row segment $\binom{\alpha}{m-k}$ (for $0 \le k \le N$) of Pascal
coefficients. Note that = Ef=o (Ej)+“ fc ) 

for 0 < i < j < Ny so this row segment determines 
the triangle below it completely. In particular, set- 
ting i = 2k and j = k for 0 < k < N/2 (the result- 
ing Pascal coefficients have similar form as ( 2n n l 2 k ) 
in (1) in terms of k) we will have exactly enough 
independent equations to invert this linear sys- 
tem to express the sum of the row segment as a 


Corollary For $a_1, a_2 \in \mathbb{R}$

\[
\sum_{k=0}^{n}\binom{a_1+2k}{k}\binom{a_2+2n-2k}{n-k}
= \sum_{k=0}^{n}\binom{a_1+a_2+2n+1}{k}.
\]

Proof Notice $\binom{-N-1+2k}{k} = 0$ when
$0 \le -N-1+2k < k$ (equivalently when
$\frac{N+1}{2} \le k \le N$), so choosing $N$ such that
$\lfloor N/2 \rfloor \le n \le N$ (denoted as (†)), we can let
$0 \le k \le n$ in the summation on the LHS in the lemma.


To match the corresponding terms, we apply the
lemma with $\alpha := a_1 + a_2 + 2n + 1 \in \mathbb{R}$, $m := n \in \mathbb{Z}_{\ge 0}$
and $N := -a_1 - 1 \in \mathbb{Z}_{\ge 0}$


\[
\sum_{k=0}^{n}\binom{a_1+2k}{k}\binom{a_2+2n-2k}{n-k}
= \sum_{k=0}^{N}\binom{a_1+a_2+2n+1}{n-k}.
\]
Let the Pascal coefficients be binomial coefficients
in the natural way. Now the binomial coefficients
on the RHS are 0 for $n-k<0$, but $n \le N$ from
(†), so we can let $0 \le k \le n$ in this summation as
well. (†) expressed with the new parameters is
$-2n-2 \le a_1 \le -n-1$, so we showed the statement
for $n + 2$ different integer $a_1$'s (after reversing the
order of the summation of the RHS). But in the
statement there are two degree $n$ polynomials in
$a_1$, so they are the same. □

A notable special case is $a_1 = -a_2 = a$ (cf. Fig. 4).
Now the RHS (being the half of row $2n+1$)
simplifies to $4^n$ and we get

\[
\sum_{k=0}^{n}\binom{a+2k}{k}\binom{-a+2n-2k}{n-k} = 4^n. \qquad (3)
\]
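The corollary and its special case can be spot-checked with exact rational arithmetic. The sketch below assumes the corollary reads $\sum_{k=0}^{n}\binom{a_1+2k}{k}\binom{a_2+2n-2k}{n-k} = \sum_{k=0}^{n}\binom{a_1+a_2+2n+1}{k}$ and takes the binomial coefficients in the usual generalised sense $\binom{a}{k} = a(a-1)\cdots(a-k+1)/k!$; the helper names are hypothetical.

```python
from fractions import Fraction


def gen_binom(a, k: int) -> Fraction:
    """Generalised binomial coefficient a(a-1)...(a-k+1)/k!, zero for k < 0."""
    if k < 0:
        return Fraction(0)
    result = Fraction(1)
    for j in range(k):
        result *= Fraction(a) - j
        result /= j + 1
    return result


def corollary_lhs(a1, a2, n: int) -> Fraction:
    """Sum of C(a1 + 2k, k) * C(a2 + 2n - 2k, n - k) over 0 <= k <= n."""
    return sum(gen_binom(a1 + 2 * k, k) * gen_binom(a2 + 2 * n - 2 * k, n - k)
               for k in range(n + 1))


def corollary_rhs(a1, a2, n: int) -> Fraction:
    """Sum of C(a1 + a2 + 2n + 1, k) over 0 <= k <= n."""
    return sum(gen_binom(a1 + a2 + 2 * n + 1, k) for k in range(n + 1))
```

With $a_2 = -a_1$ the right-hand side is half of row $2n+1$ of Pascal's triangle, so `corollary_lhs(a, -a, n)` should return $4^n$ for any rational `a`.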

Binomial Series 

This section sketches a deeper result giving a good
insight into our main problem, (1).

The above corollary states that the polynomial in
variables $a_1, a_2$ of the LHS is in fact a polynomial
in the single variable $a_1 + a_2$. This generalises to
arbitrary linear terms in the binomial coefficients
(see exercise 2) and is the main step in proving the
following theorem.

As in Solution IV, for $\alpha, \beta \in \mathbb{R}$ define the generating
function $G_{\alpha,\beta}(x) := \sum_{n=0}^{\infty}\binom{\alpha+\beta n}{n}x^n$, which is
convergent if $|x| < R_\beta := |\beta-1|^{\beta-1}/|\beta|^{\beta}$ (with $0^0 := 1$)
independently of $\alpha$.

Theorem For a, /3 e R and \x\ <Rp there is 
a hmction from (-Rp, Rp) to M >0 given by 
such that 

n n (t _ Os/O 

^ 1 -v^)‘ 

This ^ also satishes x^(xY~ ^(x) + 1 = 0 and %p(x) 

£i+(-*) = 1 - 

Note that the exponential behaviour of G a ,o (cf. 
the binomial theorem) remains true for any ex- 
plaining (1) in depth. 

The additional properties of ^ can be used to de- 
termine its closed form (and hence that of Q a p) 
for some special values of [3 (and hence for 1 - jS). 
It happens to be simpler to consider the scaled 
%p(y)-=tp(Rpy) g ivin g 




< 2 (sinh 2 g = Uy) = 

<i(sin6») = |cos 2 (| - |) io(y) = 1 + 2/ 
£1 (y) = «l + j / 2 + yf. 



From this, the crucial ( 2 ”) = ( •!)"( „ 2 ) observa- 
tion in the algebraic solutions simply follows 
from Go, 2 (x) = Ue = So(-4x) -1 / 2 = G_i fl {-4x). 
Finally we get another proof for generalisation 
(3) as a corollary of the theorem: 


EE 


a + 2 k\ (—a + 2 n — 2 k' 


*)(■ 


b( x ) a &(s) a - n 


We conclude that our main identity (1) is just the
equality of two series expansions of $G_{\alpha,\beta}(x)^2$ when
$\alpha = 0$, $\beta = 2$, so using the closed forms for some $\beta$'s
above we can deduce similar identities.


Exercises 

1. If N>0 and ( -l) k W N<k = (+) + (+_+), then 
x N + y N = w N,k-{x + y) N - 2 \xy) k 

0<k<N/2 

[proving the fundamental theorem of symmetric
polynomials in 2 variables constructively].


2. For fixed $n$ and $\beta \in \mathbb{R}$

\[
\sum_{k=0}^{n}\binom{a_1+\beta k}{k}\binom{a_2+\beta(n-k)}{n-k}
\]

depends only on $a_1 + a_2$.

[Hint: use induction on n, for the induc- 
tion step use the polynomial argument 
and the Split-Induce-Merge technique for 

(U k ) = UIU + ( Ql+ +f _1) ) with induc- 
tion on a x £ Z> 0 .] 
Contemplation, by Paul Klee

Diana Danciu

Part III Systems Biology


Alan Turing, a graduate of the University of
Cambridge, was a true visionary and an
important figure in the Mathematics of the
20th century. Mostly known as the "father of computer
science and artificial intelligence" and for
devising methods to break German ciphers during
World War II (including the famous Enigma
machine), he also made important contributions
to the field of Mathematical Biology. In his paper
"The Chemical Basis of Morphogenesis" (1952)
(see [1]), Turing explains how a system that is
initially homogeneous and stable can later develop
spatial instabilities - the so-called "Turing
instabilities" - that lead to the formation of spatial
patterns such as leopards' spots, zebras' stripes and
even people's fingers.

Morphogenesis is the part of embryology that
studies the formation of pattern and form. The
embryo, initially homogeneous, contains two
types of chemicals called morphogens (an inhibitor
and an activator), which can be produced by
cells in the embryo (reaction) or can diffuse into
each other, giving the name of the following type
of equations, a reaction-diffusion system

\[
\frac{\partial u}{\partial t} = f(u,v) + D_1 \nabla^2 u, \qquad
\frac{\partial v}{\partial t} = g(u,v) + D_2 \nabla^2 v,
\]

where $u$ and $v$ are the concentrations of two
morphogens; $D_1$, $D_2$ are the two corresponding
diffusion coefficients and $f$, $g$ are the two functions
describing the reaction kinetics. Turing's basic idea
was that if, in the absence of diffusion (i.e. $D_1 = D_2 = 0$),
$u$ and $v$ tend to a linearly stable uniform
steady state, then spatially inhomogeneous patterns
can evolve by diffusion driven instabilities
(if $D_1 \neq D_2$) under conditions which we shall
deduce. Thus, we start from a steady-state solution
$(u_0, v_0)$ of the homogeneous system ($D_1 = D_2 = 0$).
In other words, we have $f(u_0, v_0) = g(u_0, v_0) = 0$.
We consider small perturbations by setting
$u = u_0 + u(t)\cos(kx)$, $v = v_0 + v(t)\cos(kx)$ and
linearizing the system about the steady state, meaning
that we expand in Taylor series about $(u_0, v_0)$
up to linear order. We then get

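\[
\frac{du}{dt}\cos(kx) = u(t)\cos(kx)\frac{\partial f}{\partial u}(u_0,v_0) + v(t)\cos(kx)\frac{\partial f}{\partial v}(u_0,v_0)
\]
\[
\frac{dv}{dt}\cos(kx) = u(t)\cos(kx)\frac{\partial g}{\partial u}(u_0,v_0) + v(t)\cos(kx)\frac{\partial g}{\partial v}(u_0,v_0)
\]

Or, in matrix notation, by cancelling the $\cos(kx)$
factors

\[
\frac{d}{dt}\begin{pmatrix} u \\ v \end{pmatrix} = J \begin{pmatrix} u \\ v \end{pmatrix},
\qquad
J = \begin{pmatrix} \frac{\partial f}{\partial u} & \frac{\partial f}{\partial v} \\ \frac{\partial g}{\partial u} & \frac{\partial g}{\partial v} \end{pmatrix}_{(u_0,v_0)},
\]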


where $J$ is the Jacobian matrix, whose eigenvalues
tell us about the stability of the system.
The system is stable if both eigenvalues have
negative real part, which is equivalent to having
$D = \det(J) = f_u g_v - f_v g_u > 0$ and $T = \mathrm{trace}(J) = f_u + g_v < 0$,
with the functions calculated in the equilibrium
point $(u_0, v_0)$. This follows from the fact that the
determinant and trace of a matrix are the product
and the sum of its eigenvalues, respectively. If
next we return to the inhomogeneous system, i.e.
we add in the diffusion terms, then (by noting that
$\nabla^2 u(t)\cos(kx) = -k^2 u(t)\cos(kx)$), we get a new
so-called modified Jacobian

\[
J_{mod} = \begin{pmatrix} f_u & f_v \\ g_u & g_v \end{pmatrix} - k^2 \begin{pmatrix} D_1 & 0 \\ 0 & D_2 \end{pmatrix}.
\]




What we need to do now is to inspect the trace
and determinant of this new Jacobian and set
conditions such that the system is unstable. Indeed,
the trace is still less than zero, but the determinant
has a more interesting form:

\[
\det(J_{mod}) = D_1 D_2 k^4 - (D_1 g_v + D_2 f_u)k^2 + (f_u g_v - f_v g_u)
= Ak^4 - Bk^2 + C,
\]

having it written in the form of a quadratic function
in $k^2$. In order for the new system to be unstable
while the trace of its Jacobian is negative, we
need its determinant to be negative for some $k$, and
thus we look at the discriminant of this quadratic.

By inspecting the properties of the quadratic
function (see also Figure 1), we find that $B > 0$ is a
necessary condition and that, together with it,
$B^2 - 4AC > 0$ is sufficient for $\det(J_{mod})$ to be
negative for some $k$ (i.e. some $k$ gives instability),
giving us $B > 2\sqrt{AC}$. In conclusion, our condition
for developing Turing instabilities is

\[
D_1 g_v + D_2 f_u > 2\sqrt{D_1 D_2 (f_u g_v - f_v g_u)}.
\]
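The instability test can be written out as a small Python sketch. It assumes $\det(J_{mod}) = Ak^4 - Bk^2 + C$ with $A = D_1 D_2$, $B = D_1 g_v + D_2 f_u$ and $C = f_u g_v - f_v g_u$; the function names and the sample Jacobian entries are illustrative, not taken from the article.

```python
import math


def det_jmod(fu, fv, gu, gv, d1, d2, k2):
    """det(J_mod) as the quadratic A*k2**2 - B*k2 + C in k2 = k^2."""
    a = d1 * d2
    b = d1 * gv + d2 * fu
    c = fu * gv - fv * gu
    return a * k2 ** 2 - b * k2 + c


def turing_unstable(fu, fv, gu, gv, d1, d2):
    """Stable without diffusion, yet det(J_mod) < 0 for some k^2 > 0."""
    trace = fu + gv
    det = fu * gv - fv * gu
    if not (trace < 0 and det > 0):
        return False  # homogeneous steady state is not linearly stable
    b = d1 * gv + d2 * fu
    return b > 2 * math.sqrt(d1 * d2 * det)
```

For the made-up values $f_u = 1$, $f_v = -2$, $g_u = 3$, $g_v = -4$ the steady state is stable without diffusion; equal diffusion coefficients keep it stable, while $D_1 = 1$, $D_2 = 20$ satisfies the condition, and the determinant is most negative at $k^2 = B/(2A)$.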

To give some examples of the theory and its
generalizations, in 2D we can get regular planar
tessellation patterns such as squares, hexagons, rhombi
or triangles. These are solutions of

\[
\nabla^2 \psi + k^2 \psi = 0, \qquad (\mathbf{n} \cdot \nabla)\psi = 0 \ \text{for } \mathbf{r} \in \partial B,
\]

where $\mathbf{r} = (x, y)$, $\partial B$ is the closed boundary of the
reaction-diffusion domain $B$, and $\mathbf{n}$ is the unit
outward normal to $\partial B$. The following functions


\[
\psi(x, y; \phi) = \frac{\cos kx + \cos\{k(x\cos\phi + y\sin\phi)\}}{2},
\]

where $\phi$ is the rhombus angle, $k = \pm 1, \ldots$

\[
\psi(x, y) = \cos kx, \quad \text{with } k = n\pi,\ n = \pm 1, \pm 2, \ldots
\]

represent, respectively, the solutions for a hexagon,
a square, a rhombus and a one dimensional
version of the square and can be seen in Figure 2
(see [2] pp. 90-103). The beauty of Turing's theory
lies in its simplicity and in the diversity of applications
it may have: in addition to animal coat patterns,
we can understand bacterial movements,
cartilage condensation in limb morphogenesis,
embryonic fingerprint formation, wound healing,
or even growth of brain tumours - for a whole
range of applications refer to [2].

[Figure 2 panels: patterns shown for $k = \pi$ and $k = 2\pi$]



Figure 2 Patterns arising from Turing Instabilities 






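\[
\psi(x, y) = \frac{\cos k\left(\frac{\sqrt{3}y}{2} + \frac{x}{2}\right) + \cos k\left(\frac{\sqrt{3}y}{2} - \frac{x}{2}\right) + \cos kx}{3}
\]

for $k = n\pi$, $n = \pm 1, \pm 2, \ldots$

\[
\psi(x, y) = \frac{\cos kx + \cos ky}{2} \quad \text{for } k = \pm 1, \pm 2, \ldots
\]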
Figure 1 Plots of $\det(J_{mod})$ against $k^2$ for various cases
A Nice Theorem in
Multiplicative Functions

Masum Billal

Fourth Year Engineering Undergraduate, University of Dhaka

The theorem to be discussed in this article
is a nice and powerful one involving multiplicative
functions. We shall refer to it as
the Multiplicative Function Theorem (MFT).

We define $f: \mathbb{N} \to \mathbb{N}$ to be a multiplicative function
if $f(mn) = f(m)f(n)$ for any $m$ coprime to $n$.
Throughout this article, let $f$ be a multiplicative
function and the unique prime factorization of $n$
be denoted by

\[
n = p_1^{e_1} \cdots p_k^{e_k},
\]
with $p_1, \ldots, p_k$ distinct primes.

Notation

1. $\tau(n)$ is the number of divisors of $n$.

2. $\omega(n)$ is the number of distinct prime factors of $n$.

3. For a multiplicative function $f$,

\[
F(n) = \sum_{d|n} f(d).
\]

Let's call $F$ the summation function of $f$. If $p$ is a
prime, then we have

\[
F(p^a) = \sum_{d|p^a} f(d) = \sum_{i=0}^{a} f(p^i).
\]

4. For positive integers $n$ and $p$, $p^a \,\|\, n$, or alternatively
$v_p(n) = a$, means that $p^a \mid n$ whereas $p^{a+1} \nmid n$.


Our Core Theorem

Theorem (We use the notation $F$ and $f$ as defined
above.) Let $f$ be a multiplicative function.
If $F(n) = \sum_{d|n} f(d)$, then

\[
F(n) = \left(1 + f(p_1) + \dots + f(p_1^{e_1})\right) \cdots \left(1 + f(p_k) + \dots + f(p_k^{e_k})\right)
= \prod_{i=1}^{k} \sum_{j=0}^{e_i} f(p_i^j)
= \prod_{i=1}^{k} F(p_i^{e_i}).
\]

In other words, if $f$ is multiplicative, then so is $F$.

Proof Let $T$ be the expansion of the right side of
the equation, and

\[
S = \sum_{d|n} f(d).
\]

If $d$ is a divisor of $n$, then $d = p_1^{w_1} \cdots p_k^{w_k}$ where
$0 \le w_i \le e_i$ for $1 \le i \le k$. Then we have

\[
f(d) = f(p_1^{w_1} \cdots p_k^{w_k}) = f(p_1^{w_1}) \cdots f(p_k^{w_k}),
\]

which is a term that is present in $T$. Thus, we conclude
that each term of $S$ is a term of $T$. Now we
easily find that the converse is also true, since, after
multiplying, we see that every term in $T$ is of
the form $f(p_1^{w_1}) \cdots f(p_k^{w_k})$, which can be written as
$f(p_1^{w_1} \cdots p_k^{w_k})$ or $f(d)$. Therefore, every term of $T$ is
a term of $S$. Combining these two, $S = T$.
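The theorem is easy to test numerically. The following Python sketch compares the direct divisor sum $F(n)$ with the product over prime powers; `factorize`, `summation` and `summation_via_mft` are hypothetical helper names, and `f` is assumed to be multiplicative with $f(1) = 1$.

```python
from math import prod


def factorize(n: int) -> dict:
    """Prime factorization of n by trial division, as {prime: exponent}."""
    factors = {}
    p = 2
    while p * p <= n:
        while n % p == 0:
            factors[p] = factors.get(p, 0) + 1
            n //= p
        p += 1
    if n > 1:
        factors[n] = factors.get(n, 0) + 1
    return factors


def summation(f, n: int):
    """F(n): the sum of f(d) over all divisors d of n."""
    return sum(f(d) for d in range(1, n + 1) if n % d == 0)


def summation_via_mft(f, n: int):
    """Product over p^e || n of (f(1) + f(p) + ... + f(p^e))."""
    return prod(sum(f(p ** j) for j in range(e + 1))
                for p, e in factorize(n).items())
```

For instance, with `f = lambda d: 1` both functions return the number of divisors, and with `f = lambda d: d * d` they return the sum of squared divisors.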
Problems

We see some applications of the theorem by solving
some problems. The crucial fact deduced from
the theorem is: if we can find the value of $F(p^a)$
for a positive integer $a$, we are done. And more
satisfying, to do that we need to find the value of
$f(p^a)$ only. Here are some examples. First we see a
derivation of the number of divisors formula.

Example Problem 1 Find the number of divisors
of $n$.

Solution Note that, if we set $f(n) = 1$ for all $n \in \mathbb{N}$,
then $f$ is multiplicative. Then it is obvious that

\[
\sum_{d|n} 1 = \tau(n).
\]

Also, since $f$ is multiplicative, we can invoke MFT.
Using this,

\[
F(p_i^{e_i}) = \sum_{d|p_i^{e_i}} f(d) = f(1) + f(p_i) + \dots + f(p_i^{e_i}) = e_i + 1.
\]

Therefore, $\tau(n) = (e_1 + 1) \cdots (e_k + 1)$.

Example Problem 2 (Generalisation of sum of
divisors) Let $\sigma(n)$ be the sum of divisors of $n$. Let
$\sigma_r(n)$ be the sum of $r$th powers of the divisors of $n$.
That is, if $\{d_1, d_2, \ldots, d_{\tau(n)}\}$ are divisors of $n$, then

\[
\sigma_r(n) = d_1^r + \dots + d_{\tau(n)}^r.
\]

Prove that

\[
\sigma_r(n) = \frac{p_1^{(e_1+1)r} - 1}{p_1^r - 1} \cdots \frac{p_k^{(e_k+1)r} - 1}{p_k^r - 1}.
\]

Solution Set $f(n) = n^r$. This is multiplicative, since
$f(mn) = (mn)^r = m^r n^r = f(m)f(n)$ for any $m, n \in \mathbb{N}$
(and in particular for $m$ coprime to $n$). MFT gives

\[
\sigma_r(n) = \sum_{d|n} d^r = \sum_{d|n} f(d),
\]
\[
F(p_i^{e_i}) = 1 + f(p_i) + \dots + f(p_i^{e_i})
= 1 + p_i^r + \dots + p_i^{e_i r}
= \frac{p_i^{(e_i+1)r} - 1}{p_i^r - 1}.
\]

Therefore

\[
\sigma_r(n) = \frac{p_1^{(e_1+1)r} - 1}{p_1^r - 1} \cdots \frac{p_k^{(e_k+1)r} - 1}{p_k^r - 1}.
\]

Note. The formula for the usual sum of divisors
follows if we set $r = 1$.
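As a check on the closed form, the sketch below computes $\sigma_r(n)$ both directly and from the product of geometric sums. It assumes $r \ge 1$ (for $r = 0$ the denominator $p^r - 1$ vanishes and the formula for $\tau(n)$ applies instead); the function names are illustrative.

```python
def sigma_r_direct(n: int, r: int) -> int:
    """Sum of the r-th powers of the divisors of n."""
    return sum(d ** r for d in range(1, n + 1) if n % d == 0)


def sigma_r_formula(n: int, r: int) -> int:
    """Product over p^e || n of (p^((e+1)r) - 1) / (p^r - 1), for r >= 1."""
    result = 1
    p = 2
    while p * p <= n:
        if n % p == 0:
            e = 0
            while n % p == 0:
                n //= p
                e += 1
            result *= (p ** ((e + 1) * r) - 1) // (p ** r - 1)
        p += 1
    if n > 1:  # one remaining prime factor with exponent 1
        result *= (n ** (2 * r) - 1) // (n ** r - 1)
    return result
```

The integer division is exact because each quotient is the geometric sum $1 + p^r + \dots + p^{er}$.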

Example Problem 3 Prove that

\[
\sum_{d|n} \varphi(d) = n,
\]

where $\varphi(n)$ is the Euler function.

Solution

It is well known that $\varphi$ is multiplicative (we don't
prove it here), so we can invoke MFT here.

\[
F(p^e) = \sum_{i=0}^{e} \varphi(p^i)
= 1 + (p-1) + p(p-1) + \dots + p^{e-1}(p-1)
= 1 + (p-1)(1 + p + \dots + p^{e-1})
= 1 + (p-1)\frac{p^e - 1}{p - 1}
= p^e.
\]

Hence

\[
\sum_{d|n} \varphi(d) = \prod_{p^e \| n} p^e = n.
\]


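Example Problem 4 The Möbius function $\mu(n)$
is defined by

\[
\mu(n) = \begin{cases} 1 & \text{if } n = 1, \\ (-1)^{\omega(n)} & \text{if } n \text{ is square-free}, \\ 0 & \text{otherwise.} \end{cases}
\]

Prove that $\sum_{d|n} \mu(d) = 0$ for $n > 1$.

Solution First, note that, for a prime $p$, $\mu(p^a) = 0$
for $a > 1$, since it isn't square-free. Therefore

\[
F(p^e) = \mu(1) + \mu(p) + 0 + \dots + 0 = 1 - 1 = 0
\]

since $\mu(p) = (-1)^1 = -1$. Therefore

\[
\sum_{d|n} \mu(d) = \prod_{p^e \| n} \sum_{i=0}^{e} \mu(p^i) = 0.
\]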
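Problems 3 and 4 can both be verified by brute force for small $n$. The sketch below uses naive definitions of the Euler and Möbius functions rather than anything optimised; the function names are illustrative.

```python
from math import gcd


def divisors(n: int) -> list:
    """All positive divisors of n."""
    return [d for d in range(1, n + 1) if n % d == 0]


def phi(n: int) -> int:
    """Euler's function: how many 1 <= a <= n are coprime to n."""
    return sum(1 for a in range(1, n + 1) if gcd(a, n) == 1)


def mobius(n: int) -> int:
    """Moebius function: 0 if a square divides n, else (-1)^(number of primes)."""
    result = 1
    p = 2
    while p * p <= n:
        if n % p == 0:
            n //= p
            if n % p == 0:
                return 0  # p^2 divides the original n
            result = -result
        p += 1
    if n > 1:
        result = -result
    return result
```

Summing `phi(d)` over `divisors(n)` should give `n`, and summing `mobius(d)` should give 0 for every `n > 1` (and 1 for `n = 1`).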

Exercise Prove that

\[
\sum_{d|n} \mu(d) f(d) = \prod_{p|n} (1 - f(p)).
\]


Exercise Prove that 

d\n 

where $\tau(n)$ is the number of divisors of $n$.


= n 
The Disc Planimeter

Nowadays, when we need to know the area
of an irregularly shaped figure, the easiest
solution is to ask for the help of our
everlasting work companion: the computer. But
this was not always the case. In the last century
brilliant mechanical devices provided great accuracy
in the measuring of the area of irregular figures
drawn over paper. The aim of this article is to
show that it is possible to measure the area of an
irregular figure without the need for a PC or any
complex mechanical planimeter. The method put
forward is a simple one (a geometrical tool) which
obviates the need for other rough and obvious
methods like counting squares over graded paper.

Mathematical Background 

In order to achieve this aim, we firstly must consider
that every planar figure limited by a closed
curve may be expressed as a function $R = \rho(\theta)$
in polar coordinates, no matter where the point
O - the origin of coordinates - is, external or internal
to the curve.

Then, the area of this figure can be calculated by
means of the following integral:

\[
S = \frac{1}{2}\int_0^{2\pi} \rho^2(\theta)\, d\theta. \qquad (1)
\]

Note that $d\theta$ is positive when the angle is
counterclockwise, and negative when clockwise. This
integral must in general be solved numerically. The
rectangular method is the simplest of the eligible
numerical methods, and has been chosen in this
paper for this reason. In polar coordinates, the
analogous approximation to that which would
be performed in Cartesians by rectangles uses
circular sectors of constant radius $\rho_i$ and constant
angular amplitude $\Delta\theta_i = \Delta\theta$. The integral that
quantifies the area of the figure may then be
approximated by the algebraic sum of the areas of
the $N$ circle sectors:

\[
S \approx \sum_{i=1}^{N} \frac{1}{2}\rho_i^2\,\Delta\theta_i \qquad (2)
\]

where $N = 2\pi/\Delta\theta$ (the total number of sectors to
take into account) and $M = 1$ if $\Delta\theta_i < 0$, or $M = -1$
if $\Delta\theta_i > 0$. The radius $\rho_i$ represents the length of
the intersections between the contour of the figure
and the $N$ rays equally spaced in angle that
originate from O. A graphical explanation of this
approximation can be seen in Figure 2.
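The sector approximation is simple to try numerically. The Python sketch below assumes the origin O lies inside the figure (so every $\Delta\theta_i$ is positive) and uses an ellipse centred at O as a made-up test shape, for which the exact area $\pi a b$ is known; the function names are illustrative.

```python
import math


def sector_area(rho, n_rays: int) -> float:
    """Sum of the areas 0.5 * rho_i^2 * dtheta of n_rays equal circular sectors."""
    dtheta = 2 * math.pi / n_rays
    return sum(0.5 * rho(i * dtheta) ** 2 * dtheta for i in range(n_rays))


def ellipse_radius(a: float, b: float):
    """Polar radius rho(theta) of an ellipse with semi-axes a, b centred at O."""
    def rho(theta: float) -> float:
        return a * b / math.hypot(b * math.cos(theta), a * math.sin(theta))
    return rho
```

With 48 rays (one every 7.5°) the sum for a 2 by 1 ellipse agrees with $2\pi$ to many decimal places, because the rectangle rule is very accurate for smooth periodic integrands; a figure with corners, or one traced by hand, would converge more slowly.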

In order to obtain this sum directly from the operation
of a mechanical device, we need to linearise
the sum - converting the squares into quantities
capable of being added directly. This is possible
using the Fermat spiral. In a Fermat spiral, the radius
at any point is proportional to the root of the
angle, as shown in Figure 1.

Consider now a double Fermat spiral, centred at
the same point O and with the same origin of angles,
defined as:
Figure 1 The Fermat spiral


Figure 2 Approximation of a polar integration using
sectors, showing the double Fermat spiral superimposed
over the ray $i$ and the rotation angle $\psi_i$


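\[
r_i(\theta) = \sqrt{k_2\,|\theta - \theta_i|}, \qquad -a < \theta - \theta_i < a, \qquad (3)
\]

where $\theta_i$ is the orientation of ray $i$ in the discretization,
with radius $\rho_i$. If we rotate $r_i(\theta)$ by $\psi_i$, of the
same sign (and therefore of the same orientation)
as $\Delta\theta_i$, so that the following equation is verified:

\[
r_i(\theta_i - \psi_i) = \rho(\theta_i) = \rho_i, \qquad (4)
\]

then

\[
\sqrt{k_2\,|(\theta_i - \psi_i) - \theta_i|} = \rho_i
\quad \Longrightarrow \quad
k_2\,|\psi_i| = \rho_i^2. \qquad (5)
\]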

If we repeat this operation over all the rays of the
approximation of the polar integral, and substitute
in equation (2), we have:

\[
S \approx \sum_{i=1}^{N} \frac{1}{2}\rho_i^2\,\Delta\theta_i
= \frac{\Delta\theta}{2}\,k_2 \sum_{i=1}^{N} \psi_i. \qquad (6)
\]

That is, we have stated that the unknown area is
approximately equal to an algebraic sum of angles.
Having reached this point the question is whether
it is now possible to configure a new device to
quantify this sum of angles.
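The conversion from radii to angles can be illustrated with a few lines of Python. This is a sketch under the assumption that equation (5) gives $|\psi_i| = \rho_i^2 / k_2$ and that the area is then $k_1 k_2 \sum \psi_i$ with $k_1 = \Delta\theta/2$; the values $\Delta\theta = 7.5°$ and $k_2 = 363$ cm²/rad are the ones quoted later in the article, while the 10 cm circle and the function names are made up for the example.

```python
import math


def spiral_angle(rho_i: float, k2: float, sign: int = 1) -> float:
    """Rotation psi_i at which the Fermat spiral sqrt(k2 * |psi|) reaches rho_i."""
    return sign * rho_i ** 2 / k2


def area_from_angles(psis, k1: float, k2: float) -> float:
    """Area estimate k1 * k2 * (algebraic sum of the recorded angles)."""
    return k1 * k2 * sum(psis)


# Assumed example: a circle of radius 10 cm centred on O, rays every 7.5 degrees.
DTHETA = math.radians(7.5)
K1 = DTHETA / 2            # about 0.06545 rad
K2 = 363.0                 # cm^2 / rad
N_RAYS = 48                # 360 / 7.5
PSIS = [spiral_angle(10.0, K2) for _ in range(N_RAYS)]
AREA = area_from_angles(PSIS, K1, K2)   # close to pi * 10^2
```

The device never squares anything: each reading contributes an angle, and only the final sum is multiplied by the constant $k_1 k_2$, which is what the graded scale on the third sheet would encode.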

The Device 

The device consists of three transparent sheets, su- 
perimposed on top of each other. The three sheets 
are joined together so that they can be rotated 
with respect to each other. The shape of the sheets 
will be such that it is possible to rotate the third 
sheet without rotating the first and second. To 
facilitate this, the third sheet may have a smaller 
diameter. Furthermore, a rotation of the second 
sheet must imply that the third one rotates jointly 
with it. This can be achieved in a simple way by 
some sort of frictional contact between the two 
sheets. 
Figures 4-7 Example of the measurement of a small figure. (1) Second sheet; (2) Third sheet; (3) Angular graded
scale; (4) Origin of angular measurement (only shown outside the first sheet); (5) Ray acting as symmetry axis; (6) First
branch of the Fermat spiral; (7) Second branch of the Fermat spiral; (8) Figure.


On the first sheet, a full set of equally spaced rays
is plotted originating from the centre of the sheet.
The angular amplitude between two consecutive
rays corresponds to the value of $\Delta\theta$; the centre
of the sheet (and indeed that of the other sheets)
corresponds to the polar origin O (in the mathematical
discussion above). On the second sheet,
a single ray with the same centre plays the role
of the origin of angular measurement. Finally, on
the third sheet, the double Fermat spiral

\[
r(\theta) = \sqrt{k_2\,|\theta|}, \qquad -a < \theta < a, \qquad |a| = \frac{R^2}{k_2} \qquad (7)
\]

is plotted, with $R$ being the radius of the sheet. On
this sheet an angular graded scale, also centred
on O, and a single ray that acts as the symmetry
axis of the double Fermat spiral are also plotted.
In Figure 3, a blown-up perspective of the device
can be seen.

Using this device, the polar function defined by
equation (3) may be formed over any figure, by
means of the transparency of the sheets, simply by
rotating the second and third sheets (while maintaining
the first sheet stationary on the paper) until
the ray of the third sheet overlaps ray $\rho_i$.

The angle $\psi_i$ is then calculated by graphically solving
equation (4). This can be achieved by rotating
only the third sheet, with the same orientation of
angular increase $\Delta\theta_i$, until the appropriate branch
of the double Fermat spiral plotted on the sheet
Figure 3 A blown-up perspective of the device


overlaps the point of intersection between the
contour of the figure and the ray $\rho_i$ from the full
set of rays plotted on the first sheet.

The angle $\psi_i$ is quantified by the angular amplitude
between the origin of the angular measurements
plotted in the second sheet and the ray of
the third sheet.

When a new measurement is taken for ray $\rho_{i+1}$
following this procedure, the first angle $\psi_i$ will
remain recorded, provided care is taken that the
relative rotation between the second and third
sheets is zero when overlapping the ray of the
third sheet and the ray $\rho_{i+1}$. As mentioned above,
some sort of frictional contact between the sheets
will ensure this occurs.

The new angle $\psi_{i+1}$ is then added to $\psi_i$. Therefore,
at the end of the process, when all the rays that
intersect the contour of the figure have been accounted
for, the sum (6) will be shown in the angular
graded scale of the third sheet as the total
angular amplitude between the rays of the second
and third sheet. If this angular scale is properly
arranged, then the value that is read in the scale
will be the unknown area of the object of interest.

In Figures 4-7 an example is shown of the measurement
of a small figure, following the steps as
explained above. In this case, a simple ray is used,
and two angle measurements (one positive and
the other negative) are carried out because the
polar centre is located outside the figure.

The design parameters of the device are the following:

• $R$, proportional to the size of the device;

• $k_1 = \Delta\theta/2$, proportional to the angular
discretization plotted on the first sheet;

• $k_2$, the coefficient that affects the shape of the
double Fermat spiral plotted on the third sheet.

The first parameter does not affect the quality of
the measurement, but the accuracy of the device
is inversely proportional to $k_1$ and $k_2$. The drawback
is that the maximum number of rotations is
inversely proportional to these parameters too. In
order to avoid the implementation of a revolution
counter, it is necessary to limit their value. This
necessarily limits the accuracy of the device.

Conclusion

Through experimental research done with a
number of different designs of the device, it has
been demonstrated that it is relatively easy to
obtain a measurement of an irregular planar figure
to within a precision of 5%. This accuracy is
achieved by arranging a set of rays on the first
sheet of relatively low density (as small as 7.5°
of angular amplitude, equivalent to a parameter
$k_1 = 0.06545$ rad) and a parameter $k_2$ equal to
363 cm²/rad, with a radius of the third sheet equal
to $R = 19.5$ cm. The device, though somewhat
complicated to explain mathematically, is actually
extremely simple and easy to operate.

In conclusion, it has been demonstrated that, even 
today, there is still scope to add to the wide range 
of geometrical devices in existence. As has been 
seen, preparing a device for the manual measure- 
ment of areas is a relatively easy task which takes 
us back to an age prior to the dominance of CAD 
and GIS techniques. 
Today, the use of mathematical modelling to
help understand the intricacies of biological
systems has become commonplace. Biologists
now undoubtedly appreciate their multiplicative
utility as a complement to wet-lab research.
Traditionally, the approach to handling these
dynamical systems has been based on the use
of deterministic ordinary differential equations
(ODEs). However, in recent years, it has become
apparent that ODE models often fail to fully explain
how complex biological systems truly work.
Therefore, the increased deployment of stochastic
models seems paramount. In this article I hope to
briefly detail not only what a stochastic model is,
but also why and when they should be used.

What is a Stochastic Model? 
Why do we Need Them? 

Informally, a stochastic model is any model for
which the solution's trajectory through time is not
certain. That is, it is probabilistic in nature. This
places them in clear contrast to the more familiar
ODE, for which suitable initial conditions determine
the solution for all time. But why do we need
probabilistic representations? Why can't we just
use deterministic models for biological processes?
Well, the fact of the matter is that all such systems
are under the influence of 'noise' affecting the accuracy
of any model. This noise is generally divided
into two categories. Extrinsic noise introduces
uncertainty due to external environmental factors.
For example, for cellular systems this could mean
the cell cycle stage. In contrast, intrinsic noise is
due to small numbers of molecules; this provides
an uncertainty of knowing when a reaction will
occur and which reaction it will be. You see, ODE
models implicitly assume compartments are well
stirred and that the abundance of all species is
high enough to permit fluctuations to be ignored,
but this is often not the case.

Now, stochastic models actually come in various 
forms, depending upon whether the dependent 
variable and the time variable are continuous or 
discrete. Here we consider a discrete dependent 
variable with continuous time. 

The Degradation Model 

As a most basic introduction to stochastic modelling
we look to the 'degradation model' frequently
used in chemistry to model the breakdown of a
chemical species. Deterministically it takes the
form:

\[
\frac{dC}{dt} = -kC,
\]

where $C$ is the chemical of interest, $k$ is a rate constant,
and the initial condition $C(0) = C_0$ has been
used.

For the stochastic modeller however, the formal
definition of $k$ is recalled: the probability a randomly
chosen molecule of species $C$ degrades in
the interval $[t, t + \delta t)$ is given by $k\,\delta t$, with $\delta t$ an
infinitesimally small time step, and so the probability
exactly one molecule degrades in this interval
is given by $C(t)k\,\delta t$. This is actually all that
is needed to design a stochastic simulation algorithm
(SSA). The number of molecules of the
chemical species, $C(t)$, at times $t = h\Delta t$, for some
pre-set small time step $\Delta t$ and $h = 1, 2, 3, \ldots$ is
found using the following algorithm:

1. Set $t = 0$ and $C(t) = C_0$. Choose $\Delta t$.

2. Generate a random number, $r$, according to
the uniform distribution on (0, 1).

3. If $r < C(t)k\Delta t$ then $C(t + \Delta t) = C(t) - 1$, else
$C(t + \Delta t) = C(t)$.

4. Repeat Steps 2 and 3 for $t = t + \Delta t$.

This works since $r \sim \mathrm{Unif}(0,1)$, and therefore the
probability that $r < C(t)k\Delta t$ is equal to $C(t)k\Delta t$.
Thus Step 3 implies the probability of a single molecule
degrading is $C(t)k\Delta t$, as required. Figure 1
depicts several realisations of the stochastic model
as well as the deterministic. Each time the algorithm
is run, a different result is achieved. One
may therefore reasonably ask what useful information
can be drawn from a stochastic model. In
most cases this is done by computing the average
and variance across a large number of simulations.
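Steps 1 to 4 translate almost line by line into code. The Python sketch below is one possible rendering, with $C(0) = 20$ and $k = 0.5$ as in Figure 1; the time step, the number of runs and the function names are assumptions made for the example, and $\Delta t$ must be small enough that $C_0 k \Delta t < 1$.

```python
import random


def simulate_degradation(c0: int, k: float, dt: float, t_end: float, rng) -> list:
    """One realisation of the fixed-step algorithm; returns C at each time step."""
    steps = int(round(t_end / dt))
    c = c0
    trajectory = [c]
    for _ in range(steps):
        r = rng.random()        # Step 2: uniform random number
        if r < c * k * dt:      # Step 3: one molecule degrades
            c -= 1
        trajectory.append(c)    # Step 4: advance time by dt
    return trajectory


def sample_mean(c0: int, k: float, dt: float, t_end: float,
                runs: int, seed: int = 0) -> float:
    """Average of C(t_end) over many independent realisations."""
    rng = random.Random(seed)
    finals = [simulate_degradation(c0, k, dt, t_end, rng)[-1]
              for _ in range(runs)]
    return sum(finals) / runs
```

Averaging, say, 1000 runs with $\Delta t = 0.01$ up to $t = 2$ should give a value near $20e^{-1} \approx 7.4$, while any single trajectory is a non-increasing staircase that differs from run to run.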





Figure 1 The stochastic and deterministic versions of
the Degradation Model. Four realisations of the stochastic
system are shown in various colours. In addition, the solution
to the deterministic model is provided in black. Here,
$C(0) = 20$ and $k = 0.5$.

For our case of the degradation model however,
we can illustrate an interesting point by computing
the stochastic mean across infinitely many
realisations, as follows.

We define $P(n, t)$ to be the probability that at
time $t$ there are $n$ molecules of chemical species
$C$. Then, we consider an infinitesimally small time
step $\delta t$ such that the probability more than one
molecule degrades during $[t, t + \delta t)$ is negligible.
Now there are two ways in which at time $t + \delta t$
there can be $n$ molecules. Either at time $t$ there
were $n$ molecules and no reaction took place, or
at time $t$ there were $n + 1$ molecules and in the
interval $[t, t + \delta t)$ one molecule was degraded.
Mathematically this can be formalised as:

\[
P(n, t + \delta t) = P(n, t)(1 - kn\,\delta t) + P(n + 1, t)\,k(n + 1)\,\delta t
\]
\[
\Longrightarrow \quad \frac{d}{dt}P(n, t) = k(n + 1)P(n + 1, t) - knP(n, t),
\]

where we take the limit as $\delta t \to 0$. This equation
is usually called the Chemical Master Equation
(CME). Numerical solution of a CME is
often extremely computationally expensive. It is
in this case though, possible to solve the above
algebraically. Recalling that $C(0) = C_0$, and that the
chemical species is only able to decay, it is easy to
see that $P(n, t) = 0$ for $n > C_0$; leaving us with a
system of $C_0 + 1$ coupled linear differential equations.
Beginning with the ODE for $P(C_0, t)$, then
$P(C_0 - 1, t)$, it is possible to formulate a hypothesis
for the form of $P(n, t)$ which can be solved
inductively to find:

\[
P(n, t) = \binom{C_0}{n} e^{-knt}\left(1 - e^{-kt}\right)^{C_0 - n}.
\]




Principally, we are usually interested in the average
number of molecules at time $t$, which is defined
using the usual formula for expectation:

\[
M(t) = \sum_{n=0}^{\infty} nP(n, t).
\]

In this case, having explicitly found $P(n, t)$, it is
possible to substitute in to find $M$. However we'll
here illustrate a more general technique for analysing
the CME. We multiply our CME across by
$n$ and sum to obtain:

\[
\frac{d}{dt}\sum_{n=0}^{\infty} nP(n, t)
= k\Big[\sum_{n=0}^{\infty} n(n + 1)P(n + 1, t) - \sum_{n=0}^{\infty} n^2 P(n, t)\Big]
= k\Big[\sum_{n=0}^{\infty} (n - 1)nP(n, t) - \sum_{n=0}^{\infty} n^2 P(n, t)\Big]
= -k\sum_{n=0}^{\infty} nP(n, t),
\]

i.e. we have:

\[
\frac{d}{dt}M = -kM \quad \Longrightarrow \quad M(t) = C_0 e^{-kt},
\]











using the condition $M(0) = C_0$. Thus the mean of
the stochastic system is equal to the solution of
the corresponding deterministic model. This is
actually true in general for linear systems, and for
non-linear systems the ODE model can act as a
first approximation to the mean of the system.
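The closed form can be checked against both the CME and the mean. The Python sketch below assumes the solution $P(n, t) = \binom{C_0}{n} e^{-knt}(1 - e^{-kt})^{C_0 - n}$ and compares a central finite difference in $t$ with the right-hand side of the CME; the function names and the step size `h` are choices made for the example.

```python
from math import comb, exp


def prob(n: int, t: float, c0: int, k: float) -> float:
    """Assumed closed form: probability of n molecules at time t > 0."""
    if n < 0 or n > c0:
        return 0.0
    return comb(c0, n) * exp(-k * n * t) * (1 - exp(-k * t)) ** (c0 - n)


def mean(t: float, c0: int, k: float) -> float:
    """M(t) = sum over n of n * P(n, t)."""
    return sum(n * prob(n, t, c0, k) for n in range(c0 + 1))


def cme_residual(n: int, t: float, c0: int, k: float, h: float = 1e-5) -> float:
    """dP/dt (central difference) minus k(n+1)P(n+1,t) + k n P(n,t)."""
    lhs = (prob(n, t + h, c0, k) - prob(n, t - h, c0, k)) / (2 * h)
    rhs = k * (n + 1) * prob(n + 1, t, c0, k) - k * n * prob(n, t, c0, k)
    return lhs - rhs
```

For $C_0 = 20$ and $k = 0.5$ the probabilities sum to 1, the residual is zero up to finite-difference error for every $n$, and `mean(t, 20, 0.5)` reproduces $20e^{-0.5t}$, in line with the derivation above.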

The SIR Epidemic Model 

To really appreciate the utility of stochastic models,
we now consider a more complex system: the SIR
model for an epidemic. The SIR model has proven
extremely useful since its creation, and has been
successfully used to model the spread of numerous
diseases, providing simple rules for how the
number of Susceptibles (S), Infectious (I), and
Recovereds (R) changes through time as a disease
spreads. Its deterministic formulation is given by:

\[
\frac{dS}{dt} = -\beta SI, \qquad \frac{dI}{dt} = \beta SI - \gamma I, \qquad \frac{dR}{dt} = \gamma I,
\]

with $S(0), I(0), R(0) \ge 0$ and $S(0) + I(0) + R(0) = N$.
Here, $\beta$ represents the transmission rate of the
disease, and $\gamma$ the recovery rate.

The basic reproductive ratio, $R_0$, is roughly defined
as the average number of secondary infections
that occur when one infective is introduced
into a completely susceptible population. For the
SIR model it takes the form:

$$R_0 = \frac{\beta}{\gamma}N.$$

Its importance lies in the fact that if $R_0 < 1$ then no
epidemic can occur; the solution for $I(t)$ decreases
monotonically to zero. However, if $R_0 > 1$ then
$I(t)$ first increases; i.e. an epidemic occurs.

To stochastically model this system, we again turn 
to the formal definitions of the parameters in the
model, and make use of our SSA from earlier to 
produce Figure 2. The key point to notice is that 
for one of the stochastic realisations no epidemic 
has occurred. This is despite the fact that R 0 is 
greater than 1, where in a deterministic setting 
an epidemic would be guaranteed. It seems logi- 
cal that an epidemic should never be certain just 
because a few simple conditions are met. This is 
therefore just one example of how incorporation 
of noise can help more accurately model biologi- 
cal systems. 
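
The possibility of an epidemic failing to take off can be explored with a short simulation. This is a sketch under stated assumptions, not the SSA used to produce Figure 2: it assumes two event types, infection at rate $\beta SI$ and recovery at rate $\gamma I$, and records how many susceptibles are ever infected in each run. The function name `simulate_sir` and its return convention are hypothetical.

```python
import random


def simulate_sir(s0, i0, beta, gamma, rng):
    """One stochastic SIR realisation.

    Returns (time at which the last infective recovers,
             number of initial susceptibles who were ever infected).
    """
    s, i, t = s0, i0, 0.0
    while i > 0:
        infection_rate = beta * s * i
        recovery_rate = gamma * i
        total = infection_rate + recovery_rate
        t += rng.expovariate(total)  # waiting time to the next event
        # Choose which event fires, in proportion to its rate.
        if rng.random() * total < infection_rate:
            s, i = s - 1, i + 1
        else:
            i -= 1
    return t, s0 - s
```

Under these assumptions, a run starting from a single infective ends with no secondary infections exactly when the first event is a recovery, which happens with probability $\gamma / (\beta S(0) + \gamma)$. Counting the runs whose second return value is 0 gives an empirical estimate of that probability.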


Conclusion

What does this mean for the future? Should ODE
models be abandoned entirely? Well, obviously
not. There are a few issues to consider. Firstly, the
scale and stability of the system being modelled
must be taken into account. Whilst stochastic effects
are inherent in both the micro- and macroscopic
worlds, in reasonably stable environments,
especially at the macroscopic level, ODE models
do perform very well. Perhaps a greater issue
though is that there are still many systems for
which we simply do not know enough to utilise
stochastic models, in which the simplified ODE
version becomes a necessity. These two alternatives
should therefore be used to assist each other
in obtaining a greater understanding of biological
mechanisms.
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Figure 2 The stochastic and deterministic versions 
of the SIR Epidemic Model. Four realisations of the 
number of infecteds for the stochastic system are 
shown in various colours. In addition, the solution to 
the deterministic model is provided in black. Here, 
$\beta = 0.03$, $\gamma = 0.01$, $S(0) = 29$, $I(0) = 1$ and $R(0) = 0$.
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The division algorithm has many important
consequences. For example, well-known 
proofs of: 

1. The highest common factor of two natural
numbers exists, and is their least positive linear
combination

2. $\mathbb{Z}$ is a principal ideal domain

3. The minimal solution to Pell's equation generates
all solutions

all follow the same pattern and use the same idea, 
that of the division algorithm. (Also note that 1 
and 2 are essentially the same.) If you have no- 
ticed how the division algorithm applies in these 
cases, you might have wondered: in which other 
cases can we apply the division algorithm? What 
is the most general case where we can apply it? 
In this article we shall attempt to answer these 
questions using groups instead of the general ring 
theory approach. 

The Set-up 

Say S is a set where the division algorithm applies. 
We definitely need some sort of order in S to say
that the 'remainder' must be 'less than' the 'divi- 
sor'. We might want S to be closed under some 
operation (so that we can repeatedly 'subtract' the 
'divisor' from the 'dividend') and we also need an 
inverse operation (i.e. the 'subtraction'). 

Such an order on S needs to be a partial order.
[Recall that a partial order on a set S is a
binary relation $\le$ on S that is (i) reflexive: $a \le a$;
(ii) antisymmetric: if $a \le b$ and $b \le a$ then $a = b$;
and (iii) transitive: if $a \le b$ and $b \le c$ then $a \le c$;
for all $a, b, c \in S$.] In addition, the closure and inverse
operations suggest that we take S to be a group.
How should the order behave under the group 
operation? Clearly we want it to be compatible 
with the operation. We also probably want the in- 
verse of a 'positive' element to be 'negative', and 
vice-versa. Do we need the group to be abelian? 
Maybe, but let's not impose all the conditions yet. 

So, to put our ideas into action, let $(G, +)$ be a
group with a partial order $\le$ such that, for all
$g, g_1, g_2 \in G$, $g_1 \le g_2$ implies $g + g_1 \le g + g_2$
and $g_1 + g \le g_2 + g$. We say that an element $g \in G$ is
positive if $0 \le g$, where 0 is the identity (zero) element
of G. Define the positive cone of G to be the set
$G^+ := \{g \in G : 0 \le g\}$ of all positive elements.

Moving on from Definitions 

Now we hopefully have the necessary axioms in 
place. Let's see if we can prove anything from 
these. The first thing that we want is probably: if
$0 \le g$, then $-g \le 0$. This follows easily: take $0 \le g$,
add $-g$ to both sides and we are done. It works the other
way around as well, so in fact we have proved:

Lemma 1 $0 \le g$ iff $-g \le 0$.

That was good. We wanted the inverse of positive
elements to be negative and it just followed from
the definition. But we want more! So take a non-zero
element $g \in G$. By Lemma 1 we can take g to
be positive without loss of generality. How about
adding something to both sides of $0 \le g$ again? Last
time we added $-g$. We can add 0, but that doesn't
change anything. So let's add g: $g \le g + g = 2g$. Now
what? Let's add g again! $g + g \le 2g + g$, i.e. $2g \le 3g$.
Combining the last two gives $0 \le g \le 2g \le 3g$. It
follows by induction that $mg \le ng$ for all integers
$0 \le m \le n$ (note: here $ng = g + g + \dots + g$, n times).
This looks promising.

What about 'negative' elements? Note that by Lemma 1,
$-g \le 0$. Adding $-g$ to both sides yields $-2g \le -g$,
and so on. So we have another nice result:

Lemma 2 $mg \le ng$ for all integers $m \le n$ and $g \in G^+$.

It seems that this is all we can derive from our first 
principles. So we want to apply more restrictions 
on G. Let $a, b \in G$ be positive with $b \le a$. As in the
division algorithm, let's look at $a - b$, $a - 2b$, ...
etc. We want this sequence to stop as soon as
$a - nb$ becomes negative. How do we do this? In
other words, we want the set $\{a - nb : n \in \mathbb{Z}\}$ to
have a least positive element. Did something just
pop up in your mind? A set having a least element
must have reminded you of something like... the
well-ordering principle! (In case you don't know
what it is, it is basically the statement that the
natural numbers $\mathbb{N}$ are well-ordered; that is, every
non-empty subset of $\mathbb{N}$ has a least element.) So
how about we impose the extra condition that $\le$
is a well-order on G? But hang on. This is clearly
absurd if we think about it for a minute, as for any
positive element g the set $\{ng : n \in \mathbb{Z}\}$ has no least
element. How about the least positive element
then? In other words, let's say $G^+$ is well-ordered
under $\le$.

Now G has quite a few nice properties: it is a group
under $+$, $\le$ is an order on G preserving $+$, and its
positive cone is well-ordered. Let's see if our ideas
work now!

The Generalisation 

Let d be the least non-zero element in $G^+$ and $g \in G$
be any non-zero element. Without loss of generality,
$0 \le g$. Then $g \in G^+$ so $d \le g$. Consider the
elements $nd$ for $n \in \mathbb{Z}$. We want $nd \le g < (n + 1)d$
for some n. Can we achieve this? We certainly have
$nd \le g$ for some n (namely $n = 1$), so we need $g < n'd$
for some $n'$. By Lemma 2, $n'$ must be greater than n.
How do we know that $n'$ exists?

Suppose it doesn't. Then $nd \le g$ for all sufficiently
large n, so $nd \le g$ for all $n \in \mathbb{Z}$ by Lemma 2. Then
$0 \le g - nd$, i.e. $g - nd \in G^+$, for all integers n. Hence
$\{g - nd : n \in \mathbb{Z}\} \subseteq G^+$, so it has a least element
$g - md$. Then $g - md \le g - nd$ for all n, which implies
$0 \le (m - n)d$ for all integers n. Taking $n = m + 1$ gives
$0 \le -d$, which together with $0 \le d$ forces $d = 0$,
a contradiction.

Now we can take the maximal n such that
$nd \le g$. Then $g < (n + 1)d$, so $nd \le g < (n + 1)d$.
The left inequality says $g - nd \in G^+$, and the
right inequality says $g - nd < d$. So, by the minimality
of d, $g - nd = 0$ and $g = nd$. This is exactly what we wanted.

We have shown that $G = \langle d \rangle$. In fact we can do
more. Clearly G cannot be finite, because otherwise
d must have finite order, i.e. $kd = 0$ for some
positive integer k. Then $0 \le d \le 2d \le \dots \le kd = 0$
by Lemma 2. So all of these must be equalities (by
antisymmetry), i.e. $d = 0$, a contradiction.

So our restrictions have not only worked, we've
shown that all groups with these properties essentially
have the same structure, that of the infinite
cyclic group. Let's give G a name: we say that the
group G is well-ordered if the set $G^+$ is well-ordered
under $\le$. We have thus proved:

Proposition 1 The only non-trivial well-ordered 
group is the group (Z, +) of integers (up to iso- 
morphism). 

Corollaries 

Now we can give one-line proofs of the facts stated
at the beginning using Proposition 1: (here any
ordering is under the usual $\le$ order in $\mathbb{R}$)

Corollary 1 The highest common factor of two 
natural numbers exists, and is their least positive 
linear combination. 

Proof For $a, b \in \mathbb{N}$, the additive group
$G = \{ax + by : x, y \in \mathbb{Z}\}$ is well-ordered, and so is
equal to $\langle d \rangle$ for d the least positive element of G.
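
Corollary 1 is easy to test numerically. The following minimal sketch (an illustration, not part of the article) searches a finite box of integer coefficients for the least positive value of $ax + by$ and compares it with the highest common factor. The function name and the choice of search bound are assumptions made for the example; the bound $\max(a, b)$ is large enough because Bézout coefficients can always be chosen with $|x| \le b$ and $|y| \le a$.

```python
from math import gcd


def least_positive_combination(a, b):
    """Least positive value of a*x + b*y over a finite box of integers x, y."""
    bound = max(a, b)  # Bezout coefficients always fit inside this box
    values = (
        a * x + b * y
        for x in range(-bound, bound + 1)
        for y in range(-bound, bound + 1)
    )
    return min(v for v in values if v > 0)


def all_multiples_of(d, a, b):
    """Check that a and b (hence every a*x + b*y) are multiples of d."""
    return a % d == 0 and b % d == 0
```

For instance, `least_positive_combination(12, 18)` can be compared against `gcd(12, 18)`, and `all_multiples_of` confirms that the generators, and therefore the whole group G, lie in $\langle d \rangle$.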

Corollary 2 $\mathbb{Z}$ is a principal ideal domain.

Proof Any ideal in $\mathbb{Z}$ is a well-ordered group, and
so must be $\langle d \rangle$ for some d.

Corollary 3 If $x_0 + y_0\sqrt{d}$ is the least solution $> 1$ to
Pell's equation $x^2 - dy^2 = 1$, then all solutions are
given by $x_n + y_n\sqrt{d} = (x_0 + y_0\sqrt{d})^n$ for $n \in \mathbb{Z}$.

Proof The solutions $x_n + y_n\sqrt{d}$ to Pell's equation
form a subgroup of the multiplicative group of
units in the ring $\mathbb{Z}[\sqrt{d}]$, which is well-ordered.
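
Corollary 3 can be checked for small cases by comparing two lists: the solutions found by direct search and those obtained as powers of the least solution. This is a sketch under assumptions: d is taken to be a small non-square positive integer, only solutions with positive x and y (i.e. $n \ge 1$) are compared, and the function names are hypothetical. The naive search for the fundamental solution is only practical when that solution is small.

```python
from math import isqrt


def fundamental_solution(d):
    """Least solution x + y*sqrt(d) > 1 of x^2 - d*y^2 = 1, by search on y."""
    y = 1
    while True:
        x = isqrt(1 + d * y * y)
        if x * x == 1 + d * y * y:
            return x, y
        y += 1


def solutions_by_powers(d, y_max):
    """Solutions with 1 <= y <= y_max, generated as powers of the least one."""
    x0, y0 = fundamental_solution(d)
    found, (x, y) = [], (x0, y0)
    while y <= y_max:
        found.append((x, y))
        # Multiply (x + y*sqrt(d)) by (x0 + y0*sqrt(d)).
        x, y = x * x0 + d * y * y0, x * y0 + y * x0
    return found


def solutions_by_search(d, y_max):
    """Solutions with 1 <= y <= y_max, found by testing every y."""
    found = []
    for y in range(1, y_max + 1):
        x = isqrt(1 + d * y * y)
        if x * x == 1 + d * y * y:
            found.append((x, y))
    return found
```

For d = 2 the least solution is (3, 2), and both functions should return the same list for any bound `y_max`.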

Exercise Show that every discrete subgroup of
$(\mathbb{R}, +)$ is infinite cyclic.
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Minimum Clues: 
Sudoku and Sudokion 

Stephen Jones 

Co-founder, Muddled Puzzles 


I make Sudokion, a large family of pure spatial- 
logic puzzles derived from Sudoku. The puz- 
zles range in size from 25 cells (5x5 grid) to 
369 cells (five interlocking and interdependent 
9x9 puzzles). Sudokion's rules are the same as
those governing Sudoku - every row, column and
cluster must contain each of the numbers 1, ..., n,
for a grid measuring n cells by n cells.


Over the years I have developed an interest in the
extent of complexity required to achieve absolute
economy of clues in puzzles of increasing grid size.
I define absolute economy as n - 1 clues, where n
is the number of cells in any row, column or cluster
of the puzzle.

On 1st January 2012 Gary McGuire, of University
College, Dublin, announced that, after 7.1 million
core-CPU hours on a supercomputer, using a
hitting-set algorithm, his team had established that
a traditional 9x9 Sudoku requires a minimum
of 17 clues to guarantee a unique solution (see
[1]). This reminded me that, even if they are not
necessarily interested in the puzzles themselves,
some mathematicians at least are interested in the
mathematics behind the puzzles.

Very Small Puzzles

Puzzles of grid sizes 1x1, 2x2 and 3x3 permit
absolute economy easily. Clearly, in the case of a
1x1 puzzle no clue is required as only one value
(1) is possible. For a 2x2 puzzle, with two rectangular
clusters, only one clue is ever required. The
simplest grid shape for a 3x3 puzzle is three
rectangles side-by-side. While the puzzle may have
three clues that do not produce a unique solution,
ideal selection of two (vital) clues allows a unique
solution (see Figure 1).

Figure 1 A 3x3 grid with absolute
economy of clues

Small Puzzles:
Compensating Symmetry

In order to achieve absolute economy, ideal selection
of vital clues is required for all puzzles of grid
size greater than 3x3.

In the case of a traditional Sudoku, with four 2x2
boxes (Figure 2), three clues in any arrangement
are insufficient to provide a unique solution. But
when the square boxes are replaced by a Sudokion
called Logikion (Figure 3) the irregularly-shaped
clusters allow a 4x4 puzzle with only three clues,
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Figure 2 4x4 Sudoku Figure 3 4x4 Logikion 
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provided that the shape of the clusters and the se- 
lection of clues are ideal. So, too, when the grid 
size is increased to 5x5 (Figure 4), a well-con- 
structed Logikion will allow absolute economy of 
four clues. 
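
The clue counts quoted for the smallest regular grids can be confirmed by brute force. The sketch below is an illustration, not the author's method: it enumerates every completed grid for a given cluster layout, then asks whether any set of a given number of clue cells can single out exactly one grid. The cluster layouts used are assumptions taken from the descriptions in the text (three side-by-side rectangles for 3x3, four 2x2 boxes for 4x4); the irregular Logikion layouts are not reproduced here, and all function names are hypothetical.

```python
from collections import Counter
from itertools import combinations


def all_solutions(n, clusters):
    """Every n x n grid whose rows, columns and clusters each contain 1..n."""
    cluster_of = {cell: i for i, cells in enumerate(clusters) for cell in cells}
    rows = [set() for _ in range(n)]
    cols = [set() for _ in range(n)]
    clus = [set() for _ in clusters]
    grid, found = {}, []

    def fill(pos):
        if pos == n * n:
            found.append(dict(grid))
            return
        r, c = divmod(pos, n)
        k = cluster_of[(r, c)]
        for v in range(1, n + 1):
            if v in rows[r] or v in cols[c] or v in clus[k]:
                continue
            rows[r].add(v); cols[c].add(v); clus[k].add(v)
            grid[(r, c)] = v
            fill(pos + 1)
            rows[r].remove(v); cols[c].remove(v); clus[k].remove(v)

    fill(0)
    return found


def count_completions(solutions, clues):
    """Number of full grids that agree with the given {cell: value} clues."""
    return sum(all(s[cell] == v for cell, v in clues.items()) for s in solutions)


def unique_puzzle_exists(solutions, n, clue_count):
    """True if some choice of clue_count clues determines exactly one grid."""
    cells = [(r, c) for r in range(n) for c in range(n)]
    for subset in combinations(cells, clue_count):
        patterns = Counter(tuple(s[cell] for cell in subset) for s in solutions)
        if 1 in patterns.values():
            return True
    return False


# Assumed layouts: 3x3 with three side-by-side rectangles, 4x4 with 2x2 boxes.
STRIPS_3 = [[(r, c) for r in range(3)] for c in range(3)]
BOXES_4 = [[(r, c) for r in range(br, br + 2) for c in range(bc, bc + 2)]
           for br in (0, 2) for bc in (0, 2)]
```

With these layouts, `unique_puzzle_exists(all_solutions(3, STRIPS_3), 3, 2)` tests whether two well-chosen clues suffice on the 3x3 grid, and `unique_puzzle_exists(all_solutions(4, BOXES_4), 4, 3)` tests the claim that three clues never suffice for the 4x4 grid with square boxes. `count_completions` shows how a poor choice of clues, such as a fully given first row on the 3x3 grid, can still leave more than one solution.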

The irregular shape of the Logikion's clusters
tends to promote a feature that I call compensating
symmetry. Figure 5 illustrates four compensating
symmetries of the Logikion shown at Figure
4. Sometimes there is a direct correlation between
two individual cells, as is the case with the two
1s in Figure 5.1. At other times a group of two or
more cells are correlated in some combination. 

Puzzles containing many compensating symme- 
tries tend to allow greater economy of clues. A 
strong group of compensating symmetries is usu- 
ally the basis for eutaxy - an ideal arrangement of 
irregularly-shaped clusters combined with ideal 
selection of clues - a combination most likely to 
produce as economic a puzzle as possible. 

Medium-sized Puzzles: 
Fragmented Clusters 

The achievement of a 6x6-grid puzzle with five 
clues requires more complexity than the Logikion 
can offer. Experience tells me that the 6x6 
Logikion in Figure 6 is about as far as five clues 
will stretch. 
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Figure 5.1 & 5.2 Compensating Symmetries 


Figure 6 6x6 Logikion Figure 7 6x6 Pandemonion 
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Figure 4 5x5 Logikion Figure 8 7x7 Pandemonion

The Pandemonion (Figure 7), with one com- 
pletely-fragmented cluster, is the least complex 
6x6 pure spatial-logic puzzle that will allow five 
clues and a unique solution. Likewise, a 7x7 Pan- 
demonion (Figure 8) is capable of as few as six 
clues. 

The Pandemonion offers scope to distribute the 
fragmented cells anywhere in the grid, thus al- 
lowing a eutaxy greater than that available to 
Logikion and, therefore, more scope for absolute 
economy. 

Introducing the Plus Factor 

All the Logikion and Pandemonion puzzles illustrated
thus far are presented in 'plain format'
- the unadorned puzzle. I also create 'plus-format'
Sudokion, puzzles upon which are superimposed
a line or lines that must contain all the values of
the puzzle.

My experience of making a very wide range of 
Sudokion has convinced me that seven clues for 
any plain-format 8x8 Sudokion will never pro- 
duce a unique solution. So, it is to plus-format 
Sudokion that we resort in order to seek absolute 



Figure 9 8x8 Diagonal Katastrophion: every row, column
and cluster, including the fragmented green and pink
clusters, and the red diagonal line must contain the
numbers 1 to 8.
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economy in puzzles with grid sizes 8x8 and 9x9. 


In the Diagonal Katastrophion (Figure 9), the
combination of the extra plane available to the
values on the diagonal and the puzzle's essential
eutaxy contributes to its absolute economy of
seven clues.

The Holy Grail: 

81 Cells, 8 Clues 

Before beginning this article, the most economic 
9x9 puzzle I had made was a Parallelogram Pan- 
demonion (Figure 10). I had high hopes that this 
type of puzzle would be able to produce an exam- 
ple with just eight clues but, having made 35 ex- 
amples with only one example having nine clues, 
I am almost certain that nine is as good as it gets. 

When the opportunity to contribute to Eureka 
arose I decided to go for the Holy Grail, a 9x9 
puzzle with only eight clues. The Para-X Pande- 
monion (Figure 11) is the result. The four planes 
created by the superimposed lines, the compen- 
sating symmetries, the nine linear intersections 
and ideal selection of clues combine to give the 
puzzle a unique solution with only eight clues. (To 
date I have made 18 Para-X Pandemonions, three 
of which contain only eight clues.) 

I have almost run out of space but there is just 
enough to state that, for a number of reasons, a 
10x10 plus-format Sudokion with nine clues is al- 
most certainly impossible. 

For more Sudokion puzzles and a brief explanation
of why a 10x10 Sudokion with nine
clues is "impossible" please visit my website,
www.muddledpuzzles.com.

References, Further Reading 

[1] Eugenie S. Reich; 2012; Mathematician claims
breakthrough in Sudoku puzzle; Nature;
http://www.nature.com/news/mathematician-claims-breakthrough-in-sudoku-puzzle-1.9751.

[2] Wikipedia; Mathematics of Sudoku;
http://en.wikipedia.org/wiki/Mathematics_of_Sudoku.
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Figure 10 9x9 Parallelogram Pandemonion: every row, 
column, cluster (including the fragmented green cluster)
and both the upper and lower red 'V' lines of the paral- 
lelogram must contain the numbers 1 to 9. 
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Figure 11 9x9 Para-X Pandemonion: every row, column, 
cluster (including the fragmented green cluster), both red
diagonal lines and both the upper and lower red 'V' lines of
the parallelogram must contain the numbers 1 to 9. 
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Small Stellated Dodecahedron, by 
Vladimir Bulatov 

A small stellated dodecahedron is a nonconvex
polyhedron, existing in $\mathbb{R}^3$. It is formed of twelve
pentagrams (stars), five of which meet at each vertex.
The sculpture reflects the true form of the shape,
with no extra intersections between faces.


See www.bulatov.org for more of Vladimir's pieces. 
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So Jar, So Good 

This speaker-in-a-jar is completely self-contained, so you can bring
it on-the-go with you for impromptu dance parties (if mathmos
happen to evolve to be good at dancing of course), and can hook
it up to pretty much anything. Apparently nothing says 'geeky' like
using simple physics in everyday life.
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The Opening Number 

This mathematical bottle opener is sure to become an operational
constant when you're seeking liquid inspiration. The hefty steel
form is calculated for optimum leverage, solving stubborn bottle
tops with near-infinite ease. When not in use, it becomes a smart,
sculptural accent or paperweight. And we all know Pi can do
everything.







The Gentleman Geek 

An age-old problem that has plagued the mathematical community
for years: how can I show off what a genius I am and still look
devilishly handsome? Now we've all got the answer! Only a handful
of people can find the phrase 'it's e o'clock' funny and we know
they're currently reading this magazine.


Head of the Glass

Even though mathmos tend not to be the heaviest of drink- 
ers, when we do it, we do it in style. The mathematical 
symbols on these glasses are bound to thrill fellow geeks 
and pique the interests of numerical novices. Whether they 
join you with water during class or with something a little 
stronger as you celebrate cracking your latest conundrum, 
you'll be glad you have your digits around.
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Pascal found order
in a game of dice.

Can you find the patterns 
in chance events ? 


If you can bring science to bear on the
toughest challenges, apply here today for our
Quantitative Analyst roles.

www.gresearch.co.uk/predict-the-future
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Compiled by Diana Danciu 


The Kiss Precise 

by Frederick Soddy 

For pairs of lips to kiss maybe 
Involves no trigonometry. 

'Tis not so when four circles kiss
Each one the other three. 

To bring this off the four must be 
As three in one or one in three. 

If one in three, beyond a doubt 
Each gets three kisses from without. 

If three in one, then is that one 
Thrice kissed internally. 

Four circles to the kissing come.
The smaller are the benter.
The bend is just the inverse of
The distance from the center.

Though their intrigue left Euclid dumb
There's now no need for rule of thumb.
Since zero bend's a dead straight line
And concave bends have minus sign,
The sum of the squares of all four bends
Is half the square of their sum.

To spy out spherical affairs
An oscular surveyor
Might find the task laborious,
The sphere is much the gayer,

And now besides the pair of pairs
A fifth sphere in the kissing shares.
Yet, signs and zero as before,
For each to kiss the other four
The square of the sum of all five bends
Is thrice the sum of their squares.

In Nature, June 20, 1936
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Two mathmos were walking along the backs 
when one said, "Where did you get such a great 
bike?" 


The second mathmo replied, "Well, I was walking
along yesterday minding my own business
when a beautiful woman rode up on this bike.
She threw the bike to the ground, took off all her
clothes and said, 'Take what you want.'"

The first mathmo nodded approvingly, "Good
choice; the clothes probably wouldn't have fit."


Q: Why did the chicken cross the road? 
A: The answer is trivial and is left as an 
exercise for the reader! 
























Two random variables were talking 
in a bar. They thought they were be- 
ing discrete but I heard their chatter 
continuously. 


Let there be a spherical cow. 


What does the B in Benoit B. Mandelbrot 
stand for? Benoit B. Mandelbrot! 


int getRandomNumber()
{
    return 4; // chosen by fair dice roll.
              // guaranteed to be random.
}




An engineer, a physicist and a mathematician 
are staying in a hotel. 


The engineer wakes up and smells smoke. He
goes out into the hallway and sees a fire, so he
fills a trash can from his room with water and
douses the fire. He goes back to bed.


Later, the physicist wakes up and smells smoke.
He opens his door and sees a fire in the hallway.
He walks down the hall to a fire hose and after
calculating the flame velocity, distance, water
pressure, trajectory, etc. extinguishes the fire
with the minimum amount of water and energy
needed.


Later, the mathematician wakes up and smells
smoke. He goes to the hall, sees the fire and then
the fire hose. He thinks for a moment and then
exclaims, "Ah, a solution exists!" and then goes
back to bed.
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Lecturer Reviews 



Prof Imre Leader ★★★★★

Professor of Pure Mathematics, DPMMS 

Any introduction about this gem of the DPMMS and Trinity College
is, pause, take a deep breath, unnecessary. I think every student
in the whole world would agree that his figure is best understood
when seen in person, sipping coffee in a lecture and talking
about induction. The fact that a thing such as The Imre Leader 
Appreciation Society exists speaks for itself and makes our argu- 
ment utterly trivial, so we should maybe stop praising His Majesty 
and mention a couple of things about his lecturing that you might 
not have noticed: 

Handwaving: (5/5) Mathematicians use the term 'hand- 
waving' when they can't quite make their arguments rigorous (at 
which point one starts to wonder how applied mathematicians' 
arms haven't come off yet) but Prof. Leader takes this expression to 
a whole new level. "Mumble mumble, and we're done!" 

Examples sheets: (5/5) No example sheet of Prof. Lead- 
er's is ever done by anyone completely. In fact, if someone did, I 
reckon that would earn him a Fields Medal or at least a substantial 
research grant. 

So after we have some coffee and stare at the results for a
while, we conclude that Prof. Leader deserves nothing less than 5/5
stars, after which we are done, aren't we? End of proof. □



Prof Tom Korner ★★★★★

Professor of Fourier Analysis, DPMMS 

There's one figure you can see at every happy hour, a person also
known to students as 'The King of the CMS', and that's Prof Korner.
He is the lecturer to go and see if you want to hear some jokes
along with the statutory theorems, lemmas and propositions. If
you listen carefully, you'll notice that he gives out life lessons on a
regular basis, and that anybody who doesn't follow them without
question may be considered a fool. You might have heard him say
things like "trivial", "obvious" and "I'm going to do something clever"
a lot, but that's only because his intelligence far transcends ours
and we just have to live with this fact and drink from his infinite
wisdom. For that very reason, he automatically gets 5/5 without
any further discussion on the matter.
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Dr Stephen Cowley ★★★★★ 

Chair of the Faculty of Mathematics, DAMTP 

Dr Cowley is well known for his love for flying paper. In our first
lecture he told us that we are allowed to throw paper planes at him
with the proviso that if the planes flew for less than 10 seconds we
had to go and pick them up. Of course, being such well-behaved
students, lots of us listened to his advice and threw paper planes
of increasing complexity, though I believe none ever flew for the
required time period. Towards the end of term, it was Dr Cowley's
turn: he would throw balls of paper at students who... let's say...
were dreaming too loudly of mathematical equations.

Dr Cowley is also a great believer in auditory learning:
each line on a graph is given its own specific sound: "Wee!", "Waa!",
"Whoo!", and Greek letters are always accompanied by an appropriate
animal noise (one letter even had its own cow toy). After we engage
brain for a second, and stop doing things the stupid way, we
conclude immediately that Dr Cowley gets a 5/5.

Dr Piers Bursill-Hall ★★★★★ 

Researcher of the History of Mathematics, DPMMS 

From here on in, Dr Bursill-Hall will be referred to by his favourite
title, "Our Merciful and Glorious Leader" (OMAGL). Perhaps
OMAGL's strongest talent as a lecturer is his ability to convey
information in as concise a manner as possible. Indeed, he often ends
his lectures several hours early, having covered all of his intended 
material for the day. He is also highly respected for his strict adher- 
ence to the traditional rules of the lecture theatre. Food, even bot- 
tled water, is strictly forbidden under all circumstances, especially 
if the packets make rustling noises. Indeed, any noises from the au- 
dience are persecuted mercilessly; however, since such noises can 
be incredibly distracting, it is to our great delight when OMAGL 
gives an exclamation of "Oh *DO* Stop Coughing". It is absolutely 
vital to attend all History of Mathematics lectures every year, and 
also the supporting "History of Science for Mathmos" series. To 
help students realise the importance of this, OMAGL sends out 
thrice-weekly reminder emails, which usually go something like 
this: "HoM today, 4pm, usual place. Your Merciful and Glorious 
Leader". This is much appreciated by all of us, since a Cambridge 
maths student's favourite trick is to pretend that their eidetic mem- 
ory obviates the need for a calendar. Oh Merciful and Glorious 
Leader, how we worship you and bask in your glory! 5/5. 
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Solutions to the 
Problems Drive 

1 Infinite Sequences


2 Isometries 

e.g. $X = \{e^{in} : n \in \mathbb{N}\}$, $f(z) = ze^{i}$

3 Recursively Coloured Triangles

3/4 

4 Primes 

(c) 

5 Subgroups 

8 

6 Random Numbers 

(b) 

7 Geometry 

1/6 

8 Spirals 

2 seconds 

11 Paths 

120 

12 Binary Numbers 

16383 

13 Probabilities 

http://en.wikipedia.org/wiki/Sleeping_Beauty_problem#Solutions


14 Time for a Crossword 



110 111 



14 Some Cake

16,21,26 


15 And some more Cake 

24 
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restarting next year. 
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countless amazing talks, great social 
events, discounts in our bookshop 
and three free copies of Eureka! 

Membership is only £5 per year or 
£10 for life. 

Email archim-eureka-secretary@srcf.ucam.org
for details, visit www.archim.org.uk
or write to

The Archimedeans 

Centre for Mathematical Sciences 

Wilberforce Road 

Cambridge, CB3 0WA

United Kingdom 


Get Involved with Eureka

If you're interested in scientific
publishing and want to get
involved with Eureka, we'd love
to have you. Email
archim-eureka-secretary@srcf.ucam.org.uk
for role descriptions.


Write for Eureka

If you want to contribute to
future issues of Eureka, please
email archim-eureka-secretary@srcf.ucam.org.
Further details can be found on our website.
Author guidelines are contained
on http://www.archim.org.uk/eureka_author_guide.php.
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